khuyentran1401/codecut-blog

CodeCut Blog Articles

Visit CodeCut Blog

About CodeCut

These notebooks are from CodeCut. CodeCut features open-source Python data science tools explained in clear, digestible tutorials. Subscribe to get:

  • Weekly articles with step-by-step guides
  • Newsletters 3x per week (2-minute digests)

Repository Overview

This repository contains 45+ comprehensive technical articles covering data science, MLOps, and AI tools.

Here are some examples of what you'll find in this repository:

Data Engineering

  • PySpark SQL - DataFrames, window functions, aggregations
  • DuckDB - Fast analytical queries for data scientists
  • DVC - Data versioning and experiment tracking
  • Delta Lake - Production lakehouses with delta-rs

Machine Learning

LLM Applications

Data Visualization

Data Utilities

  • Faker - Generate realistic test data
  • PRegEx - Human-readable regex patterns
  • Loguru - Simplified Python logging
  • Hydra - Configuration management

Setup

Prerequisites: Python 3.9+

Quick Start:

# Clone repository
git clone https://github.com/khuyentran1401/codecut-blog.git
cd codecut-blog

# Install dependencies (listed at top of each notebook)
pip install package1 package2

For faster installs, use uv: uv pip install package1 package2
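
Because dependencies are listed at the top of each notebook rather than in one shared file, it can help to collect them programmatically before installing. The following is a minimal sketch, not part of this repository. It assumes the notebooks are standard .ipynb JSON files and that dependencies appear as pip install lines (optionally prefixed with !, %, or uv) in notebook cells. The helper names packages_from_notebook and packages_from_file are hypothetical.

```python
import json
import re
import shlex
from pathlib import Path

# Assumption: dependencies appear on lines such as
# "pip install a b", "!pip install a", "%pip install a", or "uv pip install a".
PIP_LINE = re.compile(r"^\s*[!%]?\s*(?:uv\s+)?pip\s+install\s+(.+)$")


def packages_from_notebook(notebook: dict) -> list[str]:
    """Collect package names from pip-install lines in a notebook's cells."""
    found: list[str] = []
    for cell in notebook.get("cells", []):
        source = cell.get("source", "")
        # The notebook format stores source as a string or a list of lines.
        text = "".join(source) if isinstance(source, list) else source
        for line in text.splitlines():
            match = PIP_LINE.match(line)
            if not match:
                continue
            for token in shlex.split(match.group(1)):
                # Skip option flags such as -q or --upgrade; keep first occurrences only.
                if not token.startswith("-") and token not in found:
                    found.append(token)
    return found


def packages_from_file(path: str) -> list[str]:
    """Hypothetical convenience wrapper: load a .ipynb file and scan it."""
    return packages_from_notebook(json.loads(Path(path).read_text(encoding="utf-8")))
```

The resulting list can be passed to pip install or uv pip install. The sketch does not handle flags that take a value (for example -r requirements.txt), so review the output before installing.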

License

All articles are copyright © Khuyen Tran. Code examples within articles are MIT licensed for reuse.

About

45+ production-ready tutorials on data science, MLOps, and AI tools. All code is executable and adaptable for real projects.
