data-processing · GitHub Topics · GitHub

data-processing

Here are 2,531 public repositories matching this topic...

Python ETL framework for stream processing, real-time analytics, LLM pipelines, and RAG.

  • Updated May 25, 2026
  • Python

A collection of handy Bash One-Liners and terminal tricks for data processing and Linux system maintenance.

  • Updated Jan 22, 2026

Unified querying, transformation, and modification of JSON, TOML, YAML, XML, INI, HCL, KDL and CSV.

  • Updated May 22, 2026
  • Go

Data processing for and with foundation models! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷

  • Updated May 25, 2026
  • Python

A GPU-accelerated library containing highly optimized building blocks and an execution engine for data processing to accelerate deep learning training and inference applications.

  • Updated May 23, 2026
  • C++

A lightweight data processing framework built on DuckDB and 3FS.

  • Updated Mar 5, 2025
  • Python

Easy data preparation with the latest LLM-based operators and pipelines.

  • Updated May 22, 2026
  • Python

High-performance AI pipeline engine with a C++ core and 50+ Python-extensible nodes. Build, debug, and scale LLM workflows with 13+ model providers, 8+ vector databases, and agent orchestration, all from your IDE. Includes VS Code extension, TypeScript/Python SDKs, and Docker deployment.

  • Updated May 25, 2026
  • C++

The Context Layer for unstructured data: typed, versioned datasets over S3, GCS, and Azure.

  • Updated May 25, 2026
  • Python

Concurrent and multi-stage data ingestion and data processing with Elixir

  • Updated Apr 17, 2026
  • Elixir

Kubernetes-native platform to run massively parallel data/streaming jobs

  • Updated May 23, 2026
  • Rust

Large-scale pretraining for dialogue

  • Updated Oct 17, 2022
  • Python

Toolkit for Machine Learning, Natural Language Processing, and Text Generation, in TensorFlow. This is part of the CASL project: http://casl-project.ai/

  • Updated Aug 26, 2021
  • Python

Extract, Transform, Load (ETL) for Python 3.5+

  • Updated May 12, 2023
  • Python

© 2026 GitHub, Inc.