everything but art
ohayou! bonjour! guten morgen!
started the day earlier than usual - 09.09!
managed to finish my language dailies in less than half an hour.
french: elementary a2 - 1.7 developing fluency
french | english
J'ai fait le ménage et après, j'ai regardé la télé. | I did some housework and then I watched TV.
Je suis restée à la maison pendant le week-end. | I stayed home over the weekend.
Je suis sortie avec des amis toute la soirée. | I went out with friends all evening.
german: elementary a2 - 1.5 developing fluency
german | english
Das Wetter war schön. Ich war im Park. | The weather was nice. I was in the park.
Ich hatte Besuch. Meine Schwester war hier. | I had a visitor. My sister was here.
Ich war am Wochenende im Park. | I was in the park at the weekend.
Ich war kaputt. Ich war zu Hause. | I was knackered. I was at home.
regression? like, in orv?
i finished the datacamp course 'introduction to regression with statsmodels in python' just in time for lunch (12.27).
here are my notes for the course:
chapter 1: simple linear regression modeling
- regression
  - statistical models that help explore the relationship between a response variable and explanatory variables
  - given the values of the explanatory variables, values of the response variable can be predicted
- response variable
  - y variable, aka dependent variable
  - the variable you want to predict
- explanatory variable
  - x variable, aka independent variable
  - variables that explain how the response variable will change
- linear regression - when the response variable is numeric
- logistic regression - when the response variable is logical (either True or False)
- simple linear / logistic regression - only one explanatory variable
- a scatterplot can be used to visualise pairs of variables
- python packages for regression
  - statsmodels - for insight
  - scikit-learn - for prediction
- a histogram can be used to visualise the relationship between numerical and categorical variables
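the course fits these with statsmodels' ols, but the simple linear case is small enough to check by hand. a minimal sketch (made-up numbers, plain python, no statsmodels needed):

```python
# simple linear regression by hand: slope = cov(x, y) / var(x),
# intercept = mean(y) - slope * mean(x). these are the same coefficients
# a fitted statsmodels ols model would report for one explanatory variable.
def fit_simple_linear(x, y):
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # covariance and variance share a denominator, so it cancels in the ratio
    cov_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return intercept, slope

# made-up data: y = 1 + 2x exactly, so the fit recovers it
intercept, slope = fit_simple_linear([1, 2, 3], [3, 5, 7])
print(intercept, slope)  # 1.0 2.0
```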
chapter 2: predictions and model objects
- the predicting question: if i set the explanatory variables to these values, what value would the response variable have?
- extrapolating - making predictions outside the range of observed data
- fitted values (.fittedvalues attribute) - predictions on the original dataset
- residuals (.resid attribute) - actual response values minus predicted response values (how much the model missed by)
- response value = fitted value + residual
- regression to the mean
  - residuals exist because of both problems in the model and fundamental randomness
  - extreme cases are often due to randomness
  - extreme cases don't persist over time - they will eventually look like average cases
- to fit a linear regression model, you may need to transform the explanatory or response variable if they do not give a straight line
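the fitted value + residual identity is easy to verify by hand. a little sketch with made-up data (the intercept and slope below are the least-squares fit for it):

```python
# fitted values and residuals for a line y_hat = intercept + slope * x.
# these mirror what the notes call .fittedvalues and .resid:
# each residual is the actual response minus the prediction.
def fitted_and_residuals(x, y, intercept, slope):
    fitted = [intercept + slope * xi for xi in x]
    resid = [yi - fi for yi, fi in zip(y, fitted)]
    return fitted, resid

x = [1, 2, 3, 4]
y = [2, 4, 5, 9]  # made-up, deliberately noisy
fitted, resid = fitted_and_residuals(x, y, -0.5, 2.2)  # least-squares line here

# response value = fitted value + residual, for every observation
assert all(abs(yi - (fi + ri)) < 1e-9 for yi, fi, ri in zip(y, fitted, resid))
# for a least-squares fit the residuals also sum to (numerically) zero
print(abs(sum(resid)) < 1e-9)  # True
```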
chapter 3: assessing model fit
- coefficient of determination (r-squared or R-squared)
  - the proportion of the variance in the response variable that is predictable from the explanatory variable
  - 1 is a perfect fit
  - 0 is the worst possible fit
  - for simple linear regression, it equals the correlation squared
- residual standard error
  - roughly a measure of the typical size of the residuals - how much the predictions are typically wrong
- leverage - a measure of how extreme the explanatory variable values are
- influence - measures how much the model would change if you left an observation out of the dataset when modeling
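here's r-squared computed both ways - from its variance definition and as the correlation squared - plus the residual standard error, on made-up data (same hand-rolled style, no statsmodels):

```python
import math

# made-up data and its least-squares line
x = [1, 2, 3, 4]
y = [2, 4, 5, 9]
intercept, slope = -0.5, 2.2
fitted = [intercept + slope * xi for xi in x]

mean_y = sum(y) / len(y)
ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
ss_tot = sum((yi - mean_y) ** 2 for yi in y)

# definition: proportion of variance in y explained by the model
r_squared = 1 - ss_res / ss_tot

# for simple linear regression this equals the correlation squared
mean_x = sum(x) / len(x)
cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
corr = cov / math.sqrt(sum((xi - mean_x) ** 2 for xi in x) * ss_tot)
assert abs(r_squared - corr ** 2) < 1e-9

# residual standard error: typical size of a residual
# (n - 2 degrees of freedom, since two parameters were fitted)
rse = math.sqrt(ss_res / (len(x) - 2))
print(round(r_squared, 3), round(rse, 3))  # 0.931 0.949
```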
chapter 4: simple logistic regression modeling
- logistic regression
  - a type of generalised linear model
  - used when the response variable is logical
  - follows an s-shaped curve
- odds ratio - the probability of something happening divided by the probability that it doesn't
  - odds_ratio = probability / (1 - probability)
- four outcomes to a logical response variable
  - predicted false, actual false - correct (true negative, tn)
  - predicted false, actual true - false negative (fn)
  - predicted true, actual false - false positive (fp)
  - predicted true, actual true - correct (true positive, tp)
- confusion matrix - the counts of each outcome
- accuracy of model
  - the proportion of correct predictions
  - accuracy = (tn + tp) / (tn + fn + fp + tp)
- sensitivity of model
  - the proportion of actual trues the model predicted correctly
  - sensitivity = tp / (fn + tp)
- specificity of model
  - the proportion of actual falses the model predicted correctly
  - specificity = tn / (tn + fp)
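those three formulas fit in a few lines. a sketch with made-up confusion-matrix counts:

```python
# accuracy, sensitivity, specificity from confusion-matrix counts.
# tn/fn/fp/tp values below are made up, just for illustration.
def confusion_metrics(tn, fn, fp, tp):
    accuracy = (tn + tp) / (tn + fn + fp + tp)  # all correct predictions
    sensitivity = tp / (fn + tp)                # actual trues caught
    specificity = tn / (tn + fp)                # actual falses caught
    return accuracy, sensitivity, specificity

acc, sens, spec = confusion_metrics(tn=50, fn=5, fp=10, tp=35)
print(acc, sens, spec)  # 0.85 0.875 0.8333333333333334
```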
blog checkpoints
today's catchup for blog posts was for the 29th, 30th, and 31st may, as well as yesterday's blog, the 4th june.
here are the times i was able to finish them, starting from about 16.30:
- 17.19 finished 29th may blog page
- 18.11 finished 30th may blog page
- 18.32 finished 31st may blog page
- 19.09 finished yesterday's blog page!!