Day 19-21: Specialized (NLP/CV)

Executive Summary

Domain  | Core Task                       | Standard Feature/Arch
--------|---------------------------------|-------------------------------------
NLP     | Text Classification/Translation | Embeddings, Transformers, attention
CV      | Image Classification/Detection  | Convolutions (CNN), Pooling
Tabular | Forecasting/Fraud               | XGBoost, LightGBM


1. Natural Language Processing (NLP)

Text Representation

  1. Bag-of-Words (BoW): Represents a document by raw word counts. High-dimensional, sparse, and blind to word order.

  2. TF-IDF: Weighs words by importance ($\text{TF} \times \text{IDF}$), down-weighting words that are common across the corpus (see the sketch after this list).

  3. Embeddings (Word2Vec/GloVe): Dense vectors in which semantic relationships are captured by distance and direction in the vector space.
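
A minimal scikit-learn sketch contrasting BoW counts with TF-IDF weights; the two-sentence corpus is made up purely for illustration:

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

corpus = ["the cat sat on the mat", "the dog sat on the log"]  # toy corpus

bow = CountVectorizer().fit_transform(corpus)    # raw counts: high-dimensional, sparse
tfidf = TfidfVectorizer().fit_transform(corpus)  # counts re-weighted by document rarity

print(bow.toarray())    # shared words like "the" dominate the raw counts
print(tfidf.toarray())  # shared words are down-weighted relative to distinctive ones
```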

Modern NLP: Transformers

  • Self-Attention: Each token computes attention weights over every other token in the sequence, so the model can decide which words matter most when building that token's representation (see the NumPy sketch after this list).

  • BERT: Encoder-only; suited to understanding tasks (sentiment analysis, NER).

  • GPT: Decoder-only; suited to autoregressive text generation.
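
A minimal NumPy sketch of single-head, unmasked scaled dot-product self-attention; the shapes and random projection matrices are illustrative assumptions:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))  # numerically stable softmax
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Single-head, unmasked scaled dot-product self-attention."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv         # project tokens into queries/keys/values
    scores = Q @ K.T / np.sqrt(Q.shape[-1])  # token-vs-token similarity, scaled
    weights = softmax(scores, axis=-1)       # each row: how much one token attends to all
    return weights @ V                       # mix value vectors by attention weight

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 8))                        # 5 tokens, d_model = 8
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
print(self_attention(X, Wq, Wk, Wv).shape)         # (5, 8): one mixed vector per token
```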


2. Computer Vision (CV)

Convolutional Neural Networks (CNNs)

  • Filters/Kernels: Learn features like edges, textures, and patterns.

  • Pooling: Reduces spatial dimensions (Max Pooling is most common).

  • Invariance: Weight sharing makes convolutions translation-equivariant, and pooling adds approximate translation invariance (an ear is an ear, wherever it is in the picture). A minimal model sketch follows this list.
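
A minimal PyTorch sketch of a CNN classifier; the layer sizes, input shape (3x32x32), and class count are illustrative assumptions, not a recommended architecture:

```python
import torch
import torch.nn as nn

class TinyCNN(nn.Module):
    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),  # shared filters: edges, textures
            nn.ReLU(),
            nn.MaxPool2d(2),                             # 32x32 -> 16x16
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),                             # 16x16 -> 8x8
        )
        self.classifier = nn.Linear(32 * 8 * 8, num_classes)

    def forward(self, x):
        x = self.features(x)
        return self.classifier(x.flatten(1))

logits = TinyCNN()(torch.randn(1, 3, 32, 32))  # -> shape (1, 10)
```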

Vision Transformers (ViT)

Recent breakthrough: applying the Transformer architecture to image patches instead of individual pixels. The image is split into fixed-size patches, each patch is flattened and embedded as a "token", and the resulting sequence is fed to a standard Transformer encoder, as in the sketch below.
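
A NumPy sketch of the patchify step under assumed ViT-Base-like sizes (224x224 RGB input, 16x16 patches):

```python
import numpy as np

image = np.random.rand(224, 224, 3)  # stand-in for a real image
P = 16                               # patch size
patches = (image.reshape(224 // P, P, 224 // P, P, 3)
                .swapaxes(1, 2)            # group the two patch-grid axes together
                .reshape(-1, P * P * 3))   # flatten each patch into one vector
print(patches.shape)  # (196, 768): 196 patch "tokens", each a 768-dim vector
```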


Interview Questions

1. "What is TF-IDF and why use it over simple counts?"

TF (Term Frequency) measures how often a word appears in a document. IDF (Inverse Document Frequency) penalizes words that are common across the corpus (like "the", "is"). TF-IDF therefore highlights words that are unique/important to a specific document.
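
One common smoothed formulation (exact variants differ between libraries):

$$\text{tf-idf}(t, d) = \text{tf}(t, d) \times \log\frac{N}{1 + \text{df}(t)}$$

where $N$ is the total number of documents and $\text{df}(t)$ is the number of documents containing term $t$.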

2. "Why are CNNs better than MLPs for images?"

MLPs are fully connected, so they have far too many parameters for image inputs and ignore spatial locality. CNNs use shared weights (filters) that exploit locality and translation equivariance, making them much more parameter-efficient and effective for visual patterns. The sketch below makes the parameter gap concrete.
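
A back-of-the-envelope comparison; the layer sizes are hypothetical:

```python
# Compare first-layer parameter counts for an MLP vs. a CNN on the same image
H, W, C = 224, 224, 3            # input image
fc_params = H * W * C * 1000     # first layer of an MLP with 1000 hidden units
conv_params = 3 * 3 * C * 64     # one 3x3 conv layer with 64 filters (biases omitted)
print(f"{fc_params:,} vs {conv_params:,}")  # 150,528,000 vs 1,728
```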

3. "What is Data Augmentation in the context of CV?"

Artificially increasing the dataset by applying transformations (flips, rotations, crops, color shifts) to the original images. This helps the model generalize better and reduces overfitting.
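
A sketch of such a pipeline using torchvision; the specific transforms and parameter values are illustrative, not tuned:

```python
import torchvision.transforms as T

train_tf = T.Compose([
    T.RandomHorizontalFlip(),                     # flips
    T.RandomRotation(15),                         # rotations up to +/-15 degrees
    T.RandomResizedCrop(224, scale=(0.8, 1.0)),   # crops
    T.ColorJitter(brightness=0.2, contrast=0.2),  # color shifts
    T.ToTensor(),
])
```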

