june 5

today, i predicted variables, wrote 4 blogs, and still didn't design those blind boxes

today's checklist

  • datacamp course: introduction to regression with statsmodels in python

  • catch up with blogs: 29th, 30th, 31st may and yesterday (jun 4)

  • design blind box for christmas

  • digitalise fire on marz main 9 character profiles

  • language dailies

everything but art

ohayou! bonjour! guten morgen!

started the day earlier than usual - 09.09!

managed to finish my language dailies in less than half an hour.

french: elementary a2 - 1.7 developing fluency

  • J'ai fait le ménage et après, j'ai regardé la télé. - I did some housework and then I watched TV.
  • Je suis restée à la maison pendant le week-end. - I stayed home over the weekend.
  • Je suis sortie avec des amis toute la soirée. - I went out with friends all evening.

german: elementary a2 - 1.5 developing fluency

  • Das Wetter war schön. Ich war im Park. - The weather was nice. I was in the park.
  • Ich hatte Besuch. Meine Schwester war hier. - I had a visitor. My sister was here.
  • Ich war am Wochenende im Park. - I was in the park at the weekend.
  • Ich war kaputt. Ich war zu Hause. - I was knackered. I was at home.

regression? like, in orv?

i finished the datacamp course 'introduction to regression with statsmodels in python' just in time for lunch (12.27). here are my notes for the course:

chapter 1: simple linear regression modeling

  • regression
    • statistical models that explore the relationship between a response variable and explanatory variables
    • given values of the explanatory variables, you can predict values of the response variable
  • response variable
    • y variable
    • dependent variable
    • the variable you want to predict
  • explanatory variable
    • x variable
    • independent variable
    • variables that explain how the response variable will change
  • linear regression
    • when the response variable is numeric
  • logistic regression
    • when the response variable is logical (either True or False)
  • simple linear / logistic regression
    • only one explanatory variable
  • a scatterplot can be used to visualise pairs of variables
  • python packages for regression
    • statsmodels - for insight (see the fitting sketch after this list)
    • scikit-learn - for prediction
  • a histogram can be used to visualise the relationship between numerical and categorical variables
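
to make the statsmodels bit concrete, here's a minimal fitting sketch. the dataframe and column names (df, x, y) are made up for illustration, not from the course's dataset:

import numpy as np
import pandas as pd
from statsmodels.formula.api import ols

# made-up example data: one explanatory variable (x) and one numeric response (y)
rng = np.random.default_rng(42)
df = pd.DataFrame({"x": np.arange(10.0)})
df["y"] = 2.0 * df["x"] + 1.0 + rng.normal(0, 1, len(df))

# fit a simple linear regression: response ~ explanatory
mdl = ols("y ~ x", data=df).fit()

# intercept and slope
print(mdl.params)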

chapter 2: predictions and model objects

  • the predicting question: if i set the explanatory variables to these values, what value would the response variable have? (see the sketch after this list)
  • extrapolating
    • making predictions outside the range of observed data
  • fitted values (.fittedvalues attribute) - predictions on original dataset
  • residuals (.resid attribute) - actual response values minus predicted response values (how much the model missed by)
  • response value = fitted value + residual
  • regression to the mean
    • residuals exist due to both problems with the model and fundamental randomness
    • extreme cases are often due to randomness
    • extreme cases don’t persist over time - will eventually look like average cases
  • to fit a linear regression model, may need to transform the explanatory or response variable if they do not give a straight line
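
and a quick sketch of the prediction workflow, reusing the made-up df and mdl from the chapter 1 snippet:

# explanatory values to predict for (values past x = 9 would be extrapolating)
explanatory_data = pd.DataFrame({"x": np.arange(5.0, 15.0)})
predictions = mdl.predict(explanatory_data)
print(predictions)

# fitted values and residuals on the original dataset
fitted = mdl.fittedvalues   # predictions on the original data
residuals = mdl.resid       # actual minus predicted
# sanity check: response value = fitted value + residual
assert np.allclose(df["y"], fitted + residuals)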

chapter 3: assessing model fit

  • coefficient of determination (r-squared or R-squared)
    • the proportion of the variance in the response variable that is predictable from the explanatory variable
    • 1 is a perfect fit
    • 0 means worst possible fit
    • correlation squared - for simple linear regression
  • residual standard error
    • roughly a measure of the typical size of the residuals - how wrong the predictions typically are (computed in the sketch after this list)
  • leverage - a measure of how extreme the explanatory variable values are
  • influence - measures how much the model would change if you left that observation out of the dataset when modeling
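
here are the chapter 3 metrics on the same made-up model, as a sketch (the summary_frame column names are from memory, so worth double-checking):

# coefficient of determination
print(mdl.rsquared)

# residual standard error: square root of the mean squared residual
rse = np.sqrt(mdl.mse_resid)
print(rse)

# leverage and influence diagnostics
summary_info = mdl.get_influence().summary_frame()
print(summary_info["hat_diag"])   # leverage
print(summary_info["cooks_d"])    # influence (cook's distance)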

chapter 4: simple logistic regression modeling

  • logistic regression
    • type of generalised linear model
    • used when the response variable is logical (see the sketch after this list)
    • follows s-shaped curve
  • odds ratio - the probability of something happening divided by the probability that it doesn't
    • odds_ratio = probability / (1 - probability)
  • four outcomes to a logical response variable
    • predicted false, actual false - correct
    • predicted false, actual true - false negative
    • predicted true, actual false - false positive
    • predicted true, actual true - correct
    • a confusion matrix - the counts of each outcome
  • accuracy of model
    • the proportion of correct predictions
    • accuracy = (tn + tp) / (tn + fn + fp + tp)
  • sensitivity of model
    • the proportion of actual positives that were correctly predicted (true positive rate)
    • sensitivity = tp / (tp + fn)
  • specificity of model
    • the proportion of actual negatives that were correctly predicted (true negative rate)
    • specificity = tn / (tn + fp)
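
and a logistic regression sketch to wrap up. the data here (visits, bought) is made up, so the numbers mean nothing - it just shows the logit() call and the confusion-matrix maths:

from statsmodels.formula.api import logit

# made-up example data: a logical response (bought) and one explanatory variable (visits)
purchases = pd.DataFrame({
    "visits": [1, 2, 2, 3, 4, 5, 6, 7, 8, 9, 9, 10],
    "bought": [0, 0, 1, 0, 0, 1, 0, 1, 1, 1, 0, 1],
})

# fit a simple logistic regression (s-shaped curve for a logical response)
mdl_logit = logit("bought ~ visits", data=purchases).fit()

# confusion matrix: rows are actual, columns are predicted
tn, fp = mdl_logit.pred_table()[0]
fn, tp = mdl_logit.pred_table()[1]

accuracy = (tn + tp) / (tn + fn + fp + tp)
sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
print(accuracy, sensitivity, specificity)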

blog checkpoints

today's catchup for blog posts was for the 29th, 30th, and 31st may, as well as yesterday's blog, the 4th june. here are the times i was able to finish them, starting from about 16.30:

  • 17.19 finished 29th may blog page
  • 18.11 finished 30th may blog page
  • 18.32 finished 31st may blog page
  • 19.09 finished yesterday's blog page!!