unstructured-data · GitHub Topics · GitHub

unstructured-data

Here are 245 public repositories matching this topic...

🦉 Data Versioning and ML Experiments

  • Updated May 18, 2026
  • Python

Neo4j graph construction from unstructured data using LLMs

  • Updated May 5, 2026
  • Jupyter Notebook

A system for agentic LLM-powered data processing and ETL

  • Updated May 20, 2026
  • Python

The Context Layer for unstructured data: typed, versioned datasets over S3, GCS, Azure

  • Updated May 25, 2026
  • Python

Handles all kinds of unstructured data, with applications such as reverse image search, audio search, molecular search, video analysis, question-answering systems, and NLP.

  • Updated Apr 20, 2026
  • Jupyter Notebook

🔮 Instill Core is a full-stack AI infrastructure tool for data, model and pipeline orchestration, designed to streamline every aspect of building versatile AI-first applications

  • Updated May 20, 2026
  • Python

Nomic Developer API SDK

  • Updated Nov 11, 2025
  • Python

Fast and efficient unstructured data extraction. Written in Rust with bindings for many languages.

  • Updated Dec 21, 2024
  • Rust

A multi-modal vector database that supports upserts and vector queries using unified SQL (MySQL-Compatible) on structured and unstructured data, while meeting the requirements of high concurrency and ultra-low latency.

  • Updated May 8, 2026
  • Java

AI-Powered Data Processing: Use LOTUS to process all of your datasets with LLMs and embeddings. Enjoy up to 1000x speedups with fast, accurate query processing that is as simple as writing Pandas code.

  • Updated Apr 30, 2026
  • Python

Get clean data from tricky documents, powered by vision-language models ⚡

  • Updated Mar 25, 2026
  • Python

Humans and AI agents, building knowledge bases together. Self-hosted document annotation, version control, semantic search, and MCP.

  • Updated May 25, 2026
  • Python

Interactively explore unstructured datasets from your dataframe.

  • Updated May 23, 2026
  • TypeScript

Curate better data for LLMs

  • Updated Mar 19, 2024
  • Python

© 2026 GitHub, Inc.