@hiroyuki-kasai Thank you for creating this project with a wide variety of algorithms.
I went through the code in linear_regression_data_generator.m, but it was not quite clear to me how the data is generated. Also, when I run the code, all my rows have the same number. Can you explain how the dataset is generated for linear regression?
% set number of dimensions
d = 50;
% set number of samples
n = 7000;
% generate data
std = 0.25;
data = linear_regression_data_generator(n, d, std);
The generated data (x_train) and labels (y_train) are attached as data.xlsx and label.xlsx.
As you can see in linear_regression_data_generator.m, the weight vector w (= w_opt) to be solved for in the regression problem is set as
w_opt = 0.5 * ones(d+1, 1);
If d = 1, which corresponds to the 2-dimensional case, y = [w_1, w_2] * [x_1; x_2] with x_2 = 1, where w_1 is the slope of the line and w_2 is the intercept on the y-axis. With w_1 = w_2 = 1/2, this case is exactly
y = 1/2*x + 1/2.
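To make the d = 1 case concrete, here is a minimal sketch of the model described above. It does not reproduce linear_regression_data_generator.m. The uniform inputs from rand and the names x1, noise_std and p are assumptions made for illustration.

```matlab
% Sketch of the d = 1 case: y = w_1 * x_1 + w_2 * 1 + noise.
% Assumption: inputs are drawn uniformly from [0, 1]; the actual
% generator may sample them differently.
rng(0);                          % fix the seed for reproducibility
n = 1000;                        % number of samples
d = 1;                           % number of input dimensions
noise_std = 0.1;                 % noise level (hypothetical name)

w_opt = 0.5 * ones(d+1, 1);      % [slope; intercept] = [0.5; 0.5]
x1 = rand(d, n);                 % inputs, size d x n
x = [x1; ones(1, n)];            % append a constant row for the intercept
y = w_opt' * x + noise_std * randn(1, n);

% Fit a line to check that slope and intercept are both close to 0.5.
p = polyfit(x1, y, 1);           % p(1) = slope, p(2) = intercept
fprintf('slope = %.3f, intercept = %.3f\n', p(1), p(2));
```

The constant row of ones is what turns the last entry of w_opt into the intercept, so the fitted slope and intercept should both come out near 0.5 up to noise.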
When d = 2, we get
z = 1/2*x + 1/2*y + 1/2.
You can check this case with the following script:
close all
clear
clc
n = 1000;
d = 2;
std = 0.1;
data = linear_regression_data_generator(n, d, std);
x = data.x_train(1,:);
y = data.x_train(2,:);
z = data.y_train;
figure
% plot the samples, which lie near the plane z = 1/2*x + 1/2*y + 1/2
scatter3(x, y, z); hold on
% mark the intercept point (0, 0, 0.5)
plot3(0, 0, 0.5, 'ro','MarkerSize', 20, 'MarkerFaceColor', 'red'); hold off
xlabel('x')
ylabel('y')
zlabel('z')
That is why all the rows are the same except the last one, which corresponds to the intercept.
This behavior comes from how w_opt is set. You can change w_opt to whatever ideal value you like, and you will then get a different dataset.
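As a sketch of that idea, the snippet below builds a dataset from a different w_opt and recovers it by least squares. It is a standalone illustration, not an edit to linear_regression_data_generator.m. The chosen weights, the uniform inputs and the variable names are assumptions.

```matlab
% Sketch: a different w_opt gives a different dataset.
% Assumptions: uniform inputs in [0, 1] and the weights below are
% arbitrary illustrative choices, not values from the project.
rng(0);                          % fix the seed for reproducibility
n = 1000;                        % number of samples
d = 2;                           % number of input dimensions
noise_std = 0.1;                 % noise level (hypothetical name)

w_opt = [2; -1; 0.5];            % [w_x; w_y; intercept]
x = [rand(d, n); ones(1, n)];    % inputs plus a constant row, (d+1) x n
y = w_opt' * x + noise_std * randn(1, n);

% Least-squares estimate of the weights; should be close to w_opt.
w_hat = x' \ y';
disp([w_opt, w_hat]);            % true weights next to the estimates
```

With distinct entries in w_opt, each input dimension contributes differently to the label, so the recovered weights are no longer all equal.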