Coding Patterns


Scikit-Learn Patterns

1. Basic Pipeline (Prevent Data Leakage)

from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression

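# Fitting the scaler inside the pipeline means its mean and variance come
# from the training data only, so no test-set statistics leak into the model.
pipeline = Pipeline([
    ('scaler', StandardScaler()),
    ('clf', LogisticRegression())
])
pipeline.fit(X_train, y_train)     # fits the scaler, then the classifier, on train data
preds = pipeline.predict(X_test)   # reuses the train-fitted scaler on test data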

2. Grid Search with Cross-Validation

from sklearn.model_selection import GridSearchCV

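# 'l1' is not supported by the default lbfgs solver, so pin a solver that
# handles both penalties.
params = {
    'clf__C': [0.1, 1, 10],
    'clf__penalty': ['l1', 'l2'],
    'clf__solver': ['liblinear'],
}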
grid = GridSearchCV(pipeline, params, cv=5, scoring='f1')
grid.fit(X_train, y_train)
print(grid.best_params_)

3. Random Search (Faster)
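
A minimal sketch of random search, assuming a scaled logistic regression pipeline and synthetic data in place of a real training set. The log-uniform range for `C` and `n_iter=10` are illustrative choices, not recommendations.

```python
from scipy.stats import loguniform
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import RandomizedSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data stands in for a real training set (assumption).
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

pipeline = Pipeline([
    ("scaler", StandardScaler()),
    ("clf", LogisticRegression(max_iter=1000)),
])

# Sample C from a continuous log-uniform distribution instead of a fixed grid.
param_distributions = {"clf__C": loguniform(1e-3, 1e2)}

search = RandomizedSearchCV(
    pipeline,
    param_distributions,
    n_iter=10,        # number of sampled configurations, fixed up front
    cv=5,
    scoring="f1",
    random_state=0,
)
search.fit(X, y)
print(search.best_params_)
```

The cost is set by `n_iter` rather than by the size of a grid, which is why random search tends to be cheaper when many hyperparameters are tuned at once.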

4. Column Transformer (Mixed Types)
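
One way this pattern might look, assuming a small pandas frame with hypothetical column names (`age`, `income`, `city`): numeric columns are scaled and the categorical column is one-hot encoded before a classifier.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Toy frame; the column names and values are made up for illustration.
df = pd.DataFrame({
    "age": [25, 32, 47, 51, 38, 29],
    "income": [40_000, 52_000, 88_000, 95_000, 61_000, 45_000],
    "city": ["NY", "SF", "NY", "LA", "SF", "LA"],
})
y = [0, 0, 1, 1, 1, 0]

preprocess = ColumnTransformer([
    ("num", StandardScaler(), ["age", "income"]),
    # handle_unknown="ignore" encodes unseen categories as all zeros.
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["city"]),
])

model = Pipeline([("prep", preprocess), ("clf", LogisticRegression())])
model.fit(df, y)
preds = model.predict(df)
```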

5. Stratified K-Fold
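
A short sketch on toy data with a 75/25 class split, showing that each validation fold keeps roughly the overall class ratio. The array shapes and seed are arbitrary.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

X = np.arange(40).reshape(20, 2)
y = np.array([0] * 15 + [1] * 5)   # 15 negatives, 5 positives

skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

fold_positive_counts = []
for train_idx, val_idx in skf.split(X, y):
    # Count positives in each validation fold.
    fold_positive_counts.append(int(y[val_idx].sum()))

print(fold_positive_counts)
```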

6. Class Weight for Imbalance
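
A sketch assuming a synthetic dataset with roughly a 90/10 class split. `class_weight="balanced"` reweights each class by `n_samples / (n_classes * class_count)`; `compute_class_weight` is used here only to show the resulting values.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.utils.class_weight import compute_class_weight

# Imbalanced synthetic data (assumption: about 10% positives).
X, y = make_classification(n_samples=500, weights=[0.9, 0.1], random_state=0)

clf = LogisticRegression(class_weight="balanced", max_iter=1000)
clf.fit(X, y)

# Inspect the weights the "balanced" heuristic assigns to each class.
weights = compute_class_weight("balanced", classes=np.unique(y), y=y)
print(dict(zip(np.unique(y), weights)))
```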


NumPy Vectorization

7. Euclidean Distance Matrix
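
A possible vectorized version, using the identity ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a·b. The function name `pairwise_euclidean` is a placeholder.

```python
import numpy as np

def pairwise_euclidean(A, B):
    """Return the (len(A), len(B)) matrix of Euclidean distances."""
    sq = (
        (A ** 2).sum(axis=1)[:, None]
        + (B ** 2).sum(axis=1)[None, :]
        - 2 * A @ B.T
    )
    # Clip tiny negative values caused by floating-point rounding.
    return np.sqrt(np.maximum(sq, 0.0))

A = np.array([[0.0, 0.0], [3.0, 4.0]])
D = pairwise_euclidean(A, A)
print(D)
```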

8. Normalize Vectors (L2 Norm)
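
A minimal sketch that normalizes each row to unit L2 norm. The `eps` guard for zero vectors is an added assumption; such rows are left as zeros.

```python
import numpy as np

def l2_normalize(X, eps=1e-12):
    """Scale each row of X to unit L2 norm."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    # Avoid division by zero for all-zero rows.
    return X / np.maximum(norms, eps)

X = np.array([[3.0, 4.0], [0.0, 0.0]])
X_unit = l2_normalize(X)
print(X_unit)
```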

9. Softmax Implementation
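
A numerically stable sketch: subtracting the row maximum before exponentiating leaves the result unchanged but avoids overflow for large logits.

```python
import numpy as np

def softmax(logits, axis=-1):
    """Softmax along `axis`, stabilized by subtracting the max."""
    shifted = logits - np.max(logits, axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)

probs = softmax(np.array([[1.0, 2.0, 3.0], [1000.0, 1000.0, 1000.0]]))
print(probs)
```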

10. One-Hot Encoding
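
One common way to do this, assuming integer labels in `[0, num_classes)`: index the rows of an identity matrix.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Return a (len(labels), num_classes) one-hot matrix."""
    # Row i of the identity matrix is the one-hot vector for class i.
    return np.eye(num_classes, dtype=int)[labels]

encoded = one_hot(np.array([0, 2, 1]), num_classes=3)
print(encoded)
```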

11. Argmax with Random Tie-Breaking
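
`np.argmax` always returns the first maximum. A sketch of a variant that picks uniformly among tied maxima follows; the function name and the `rng` parameter are assumptions.

```python
import numpy as np

def argmax_random_tie(values, rng=None):
    """Return the index of a maximum, chosen uniformly among ties."""
    rng = np.random.default_rng() if rng is None else rng
    values = np.asarray(values)
    best = np.flatnonzero(values == values.max())  # indices of all maxima
    return int(rng.choice(best))

rng = np.random.default_rng(0)
picks = {argmax_random_tie([1, 5, 5, 2], rng) for _ in range(100)}
print(picks)
```

This is useful in settings such as epsilon-greedy bandits, where always taking the first maximum biases early exploration.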

12. Moving Average
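
A simple moving average can be sketched as a convolution with a uniform kernel. `mode="valid"` keeps only positions where the window fits entirely, so the output has `len(x) - window + 1` elements.

```python
import numpy as np

def moving_average(x, window):
    """Simple moving average over a 1-D array."""
    kernel = np.ones(window) / window
    return np.convolve(x, kernel, mode="valid")

smoothed = moving_average(np.array([1.0, 2.0, 3.0, 4.0, 5.0]), window=3)
print(smoothed)
```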


PyTorch Patterns

13. Basic Training Loop
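
A minimal full-batch sketch, assuming a tiny MLP and random data; the architecture, learning rate, and epoch count are placeholders. The essential order is zero gradients, forward, loss, backward, step.

```python
import torch
from torch import nn

torch.manual_seed(0)
X = torch.randn(64, 4)                 # synthetic features (assumption)
y = (X.sum(dim=1) > 0).long()          # synthetic binary labels

model = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 2))
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)

losses = []
for epoch in range(50):
    model.train()
    optimizer.zero_grad()              # clear gradients from the last step
    loss = criterion(model(X), y)      # forward pass and loss
    loss.backward()                    # backpropagate
    optimizer.step()                   # update parameters
    losses.append(loss.item())

print(f"first loss {losses[0]:.3f}, last loss {losses[-1]:.3f}")
```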

14. Validation Loop
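
A sketch of an evaluation helper, assuming a classification model and a `DataLoader` of `(inputs, labels)` batches. The name `evaluate` is hypothetical. The two key points are `model.eval()` and `torch.no_grad()`.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

def evaluate(model, loader, criterion):
    """Return (mean loss, accuracy) over a data loader."""
    model.eval()                       # disable dropout, use running BN stats
    total_loss, correct, n = 0.0, 0, 0
    with torch.no_grad():              # no gradient tracking during evaluation
        for xb, yb in loader:
            logits = model(xb)
            total_loss += criterion(logits, yb).item() * len(yb)
            correct += (logits.argmax(dim=1) == yb).sum().item()
            n += len(yb)
    return total_loss / n, correct / n

torch.manual_seed(0)
val_data = TensorDataset(torch.randn(32, 4), torch.randint(0, 2, (32,)))
val_loader = DataLoader(val_data, batch_size=8)
val_loss, val_acc = evaluate(nn.Linear(4, 2), val_loader, nn.CrossEntropyLoss())
print(val_loss, val_acc)
```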

15. Save and Load Model
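
A sketch of the state-dict approach, which saves only the parameters and requires rebuilding the same architecture before loading. The temporary file path is an assumption for the example.

```python
import os
import tempfile

import torch
from torch import nn

model = nn.Linear(4, 2)

# Save only the parameters, not the whole pickled module.
path = os.path.join(tempfile.mkdtemp(), "model.pt")
torch.save(model.state_dict(), path)

# Rebuild the same architecture, then load the saved parameters into it.
restored = nn.Linear(4, 2)
restored.load_state_dict(torch.load(path))
restored.eval()                        # switch to inference mode before predicting
```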

16. Custom Dataset
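
A minimal map-style dataset might look like the following. `ArrayDataset` is a hypothetical name, and the in-memory lists stand in for whatever storage a real dataset would read from.

```python
import torch
from torch.utils.data import DataLoader, Dataset

class ArrayDataset(Dataset):
    """Wrap in-memory features and labels as a map-style dataset."""

    def __init__(self, features, labels):
        self.features = torch.as_tensor(features, dtype=torch.float32)
        self.labels = torch.as_tensor(labels, dtype=torch.long)

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        return self.features[idx], self.labels[idx]

dataset = ArrayDataset([[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]], [0, 1, 0])
loader = DataLoader(dataset, batch_size=2, shuffle=False)
first_x, first_y = next(iter(loader))
print(first_x.shape, first_y)
```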

17. Learning Rate Scheduler
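
A sketch using `StepLR`, which multiplies the learning rate by `gamma` every `step_size` epochs. The values below are illustrative, and the training batches are elided.

```python
import torch
from torch import nn

model = nn.Linear(4, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=2, gamma=0.5)

lrs = []
for epoch in range(6):
    lrs.append(optimizer.param_groups[0]["lr"])
    # Training batches would run here; each one calls optimizer.step().
    optimizer.step()
    scheduler.step()                   # call once per epoch, after optimizer.step()

print(lrs)
```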

18. Early Stopping
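
PyTorch has no built-in early stopping, so a small helper is common. The class below is a hypothetical sketch that stops once validation loss has not improved for `patience` consecutive epochs; the loss values are made up.

```python
class EarlyStopping:
    """Signal a stop when validation loss stops improving."""

    def __init__(self, patience=3, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Record a new validation loss; return True when training should stop."""
        if val_loss < self.best - self.min_delta:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience

stopper = EarlyStopping(patience=2)
val_losses = [0.9, 0.7, 0.71, 0.72, 0.5]   # illustrative values
stopped_at = None
for epoch, loss in enumerate(val_losses):
    if stopper.step(loss):
        stopped_at = epoch
        break

print(stopped_at, stopper.best)
```

In practice the best model's state dict would usually be saved whenever `best` improves, so it can be restored after stopping.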


Evaluation Code

19. Classification Report
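
A short sketch on hand-made labels, pairing the report with a confusion matrix (rows are true classes, columns are predicted classes).

```python
from sklearn.metrics import classification_report, confusion_matrix

y_true = [0, 0, 1, 1, 1, 0]
y_pred = [0, 1, 1, 1, 0, 0]

# Per-class precision, recall, F1, and support.
print(classification_report(y_true, y_pred, digits=3))

cm = confusion_matrix(y_true, y_pred)
print(cm)
```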

20. ROC-AUC
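
A sketch on four hand-made examples. Note that `roc_auc_score` expects scores or probabilities for the positive class, not hard class predictions.

```python
from sklearn.metrics import roc_auc_score, roc_curve

y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]        # predicted scores for the positive class

auc = roc_auc_score(y_true, y_score)
fpr, tpr, thresholds = roc_curve(y_true, y_score)
print(auc)
```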

21. Precision-Recall Curve
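
A sketch on the same kind of toy scores. The precision-recall curve is often more informative than ROC when positives are rare; average precision summarizes it as a single number.

```python
from sklearn.metrics import average_precision_score, precision_recall_curve

y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]        # predicted scores for the positive class

precision, recall, thresholds = precision_recall_curve(y_true, y_score)
ap = average_precision_score(y_true, y_score)
print(ap)
```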

22. Feature Importance (Tree Models)
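
A sketch using a random forest on synthetic data; the feature names are hypothetical. `feature_importances_` holds impurity-based importances, which sum to 1 and can be biased toward high-cardinality features.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(
    n_samples=300, n_features=6, n_informative=2, n_redundant=0, random_state=0
)
feature_names = [f"f{i}" for i in range(X.shape[1])]   # placeholder names

forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Sort features from most to least important.
order = np.argsort(forest.feature_importances_)[::-1]
for i in order:
    print(f"{feature_names[i]}: {forest.feature_importances_[i]:.3f}")
```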


Implement From Scratch

23. K-Means Clustering
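
A minimal sketch of Lloyd's algorithm with random initialization from the data points. The empty-cluster handling (keep the old centroid) is one simple choice among several.

```python
import numpy as np

def kmeans(X, k, n_iters=100, seed=0):
    """Return (centroids, labels) after running Lloyd's algorithm."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iters):
        # Assignment step: nearest centroid for every point.
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: mean of each cluster; keep the old centroid if empty.
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break                      # converged
        centroids = new_centroids
    return centroids, labels

X = np.vstack([np.zeros((5, 2)), np.full((5, 2), 10.0)])
centroids, labels = kmeans(X, k=2)
print(centroids, labels)
```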

24. Logistic Regression (Gradient Descent)
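
A sketch of batch gradient descent on the mean binary cross-entropy, whose gradient with respect to the weights is `X.T @ (p - y) / n`. The learning rate, iteration count, and synthetic data are assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, lr=0.1, n_iters=1000):
    """Fit weights and bias by batch gradient descent."""
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(n_iters):
        p = sigmoid(X @ w + b)         # predicted probabilities
        w -= lr * (X.T @ (p - y) / n)  # gradient of mean log loss w.r.t. w
        b -= lr * np.mean(p - y)       # gradient w.r.t. b
    return w, b

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(float)   # linearly separable toy labels

w, b = fit_logistic(X, y)
accuracy = np.mean((sigmoid(X @ w + b) > 0.5) == y)
print(w, b, accuracy)
```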

25. Naive Bayes (Gaussian)
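
A sketch of Gaussian Naive Bayes: per-class feature means and variances are estimated, and prediction picks the class with the highest log prior plus summed per-feature log likelihood. The class name and the `1e-9` variance floor are assumptions.

```python
import numpy as np

class SimpleGaussianNB:
    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.means_ = np.array([X[y == c].mean(axis=0) for c in self.classes_])
        # Small constant avoids division by zero for constant features.
        self.vars_ = np.array([X[y == c].var(axis=0) for c in self.classes_]) + 1e-9
        self.log_priors_ = np.log([np.mean(y == c) for c in self.classes_])
        return self

    def predict(self, X):
        # Gaussian log-likelihood per feature, summed over features.
        diff = X[:, None, :] - self.means_[None, :, :]
        log_lik = -0.5 * (
            np.log(2 * np.pi * self.vars_)[None] + diff ** 2 / self.vars_[None]
        ).sum(axis=2)
        return self.classes_[np.argmax(log_lik + self.log_priors_, axis=1)]

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(5, 1, (50, 2))])
y = np.array([0] * 50 + [1] * 50)
accuracy = np.mean(SimpleGaussianNB().fit(X, y).predict(X) == y)
print(accuracy)
```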

26. KNN Classifier
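
A brute-force sketch: compute all query-to-train distances, take the `k` nearest, and vote. The function name and toy points are hypothetical.

```python
from collections import Counter

import numpy as np

def knn_predict(X_train, y_train, X_query, k=3):
    """Predict labels for X_query by majority vote among k nearest neighbors."""
    dists = np.linalg.norm(X_query[:, None, :] - X_train[None, :, :], axis=2)
    nearest = np.argsort(dists, axis=1)[:, :k]      # indices of k closest points
    return np.array([
        Counter(y_train[row].tolist()).most_common(1)[0][0] for row in nearest
    ])

X_train = np.array([[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]], dtype=float)
y_train = np.array([0, 0, 0, 1, 1, 1])
preds = knn_predict(X_train, y_train, np.array([[0.2, 0.2], [5.5, 5.5]]))
print(preds)
```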


Interview Coding Questions

27. "How would you compute the cosine similarity between two vectors?"
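
One possible answer: divide the dot product by the product of the norms. The `eps` guard against zero vectors is an added assumption.

```python
import numpy as np

def cosine_similarity(a, b, eps=1e-12):
    """Cosine of the angle between vectors a and b."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    denom = max(np.linalg.norm(a) * np.linalg.norm(b), eps)  # avoid divide-by-zero
    return float(a @ b / denom)

print(cosine_similarity([1, 2], [2, 4]))
```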

28. "Implement sigmoid function."
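
A sketch of a numerically stable version. The naive `1 / (1 + exp(-x))` overflows for large negative `x`, so the negative branch uses the equivalent form `exp(x) / (1 + exp(x))`.

```python
import numpy as np

def sigmoid(x):
    """Elementwise logistic function; returns an array of at least 1-D."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    out = np.empty_like(x)
    pos = x >= 0
    out[pos] = 1.0 / (1.0 + np.exp(-x[pos]))   # safe: exp of a non-positive number
    exp_x = np.exp(x[~pos])                    # safe: exp of a negative number
    out[~pos] = exp_x / (1.0 + exp_x)
    return out

print(sigmoid([-1000.0, 0.0, 1000.0]))
```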

29. "Implement binary cross-entropy loss."
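
A sketch of the mean binary cross-entropy, `-mean(y log p + (1 - y) log(1 - p))`. Clipping probabilities away from 0 and 1 is an added safeguard against `log(0)`.

```python
import numpy as np

def binary_cross_entropy(y_true, y_prob, eps=1e-12):
    """Mean binary cross-entropy between labels and predicted probabilities."""
    y_true = np.asarray(y_true, dtype=float)
    y_prob = np.clip(np.asarray(y_prob, dtype=float), eps, 1 - eps)  # avoid log(0)
    losses = y_true * np.log(y_prob) + (1 - y_true) * np.log(1 - y_prob)
    return float(-np.mean(losses))

print(binary_cross_entropy([1, 0], [0.9, 0.2]))
```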

30. "Write a function to compute precision, recall, and F1."
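
A plain-Python sketch for the binary case. Returning 0.0 when a denominator is zero is a convention chosen here; other conventions exist.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Return (precision, recall, f1) for the given positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return precision, recall, f1

print(precision_recall_f1([1, 1, 1, 0, 0], [1, 1, 0, 1, 0]))
```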
