git-disl/MELT: A Behavioral Trace Dataset for High-Risk Memecoin Launch Detection
Memecoin Launch Trace Dataset

This repository contains the dataset and code for the submission: "MELT: A Behavioral Trace Dataset for High-Risk Memecoin Launch Detection"

Environment

  • Python 3.9
  • Install packages using:
pip install -r requirements.txt

Part 1: Feature Generation

Step 1: Download Dataset

Because the raw transaction data is very large (>1 TB), we provide only the parsed transaction datasets, hosted on Google Drive:

  • pre_migration_tx.zip — Pre-migration (bonding curve) transactions. Required for feature generation. Download and unzip under MELT/data/tx/ (so the parsed transactions sit under data/tx/...).
  • bundle.zip — Bundle trace data. Required for feature generation. Download and unzip under MELT/data/ (expands into data/bundle/).
  • post_migration_tx.zip — Post-migration (Raydium DEX) transactions. Optional, very large, not used by feature generation. Only download if you want to do your own post-migration analysis.

You can skip Step 2 and download our pre-generated feature.pkl directly. Place it at MELT/data/feat/feature.pkl and proceed to Part 2. In this case neither pre_migration_tx.zip nor bundle.zip is needed.
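
Before running anything, it can help to confirm the files landed where the steps above expect them. The sketch below is not part of the repository: the helper name `check_layout` is hypothetical, and it assumes only the three locations named above (`data/tx/`, `data/bundle/`, and `data/feat/feature.pkl`).

```python
from pathlib import Path


def check_layout(root="MELT"):
    """Report which of the inputs described above are present under `root`."""
    root = Path(root)
    tx_dir = root / "data" / "tx"
    bundle_dir = root / "data" / "bundle"
    feature_file = root / "data" / "feat" / "feature.pkl"

    status = {
        # A directory counts as present only if it exists and is non-empty.
        "tx": tx_dir.is_dir() and any(tx_dir.iterdir()),
        "bundle": bundle_dir.is_dir() and any(bundle_dir.iterdir()),
        "feature": feature_file.is_file(),
    }
    # Step 2 needs both archives; Part 2 needs only feature.pkl.
    status["can_generate_features"] = status["tx"] and status["bundle"]
    status["can_train"] = status["feature"]
    return status
```

Calling `check_layout("MELT")` from the directory that contains the checkout returns a dictionary of booleans. If `can_train` is already true, Step 2 can be skipped.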

Step 2: Feature Generation

cd MELT/src
python feat_gen.py

This generates data/feat/feature.pkl from the pre-migration transactions, bundle trace data, and contextual information.
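
The internal structure of `feature.pkl` is not described here, so a quick look at the top-level object is a reasonable first step after generating or downloading it. The following is a minimal sketch, not the project's loader (that lives in `dataset.py`); the helper name `summarize_pickle` is hypothetical and it makes no assumption about what the pickle contains.

```python
import pickle
from pathlib import Path


def summarize_pickle(path):
    """Load a pickle file and return a small summary of the top-level object."""
    with Path(path).open("rb") as fh:
        # Unpickling can execute arbitrary code: only load files you trust.
        obj = pickle.load(fh)

    summary = {"type": type(obj).__name__}
    if hasattr(obj, "shape"):        # e.g. a pandas DataFrame or numpy array
        summary["shape"] = tuple(obj.shape)
    elif hasattr(obj, "__len__"):    # e.g. a dict or list
        summary["length"] = len(obj)
    return summary
```

For example, `summarize_pickle("MELT/data/feat/feature.pkl")` reports the object's type and size without assuming a particular schema. Any library the object depends on must be installed for the load to succeed.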

Part 2: High-risk Launch Detection

Step 1: Train a model

cd MELT/src
python train.py --model rf

--model accepts rf, xgb, lgbm, lr, or mlp, or one of the time-series models: tcn, lstm, gru, or transformer. Prediction CSVs are written to MELT/results/{model}_pred_*.csv.
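
Because each run writes a new file matching `{model}_pred_*.csv`, several prediction files can accumulate per model. A small sketch for collecting them is shown below; the helper name `list_predictions` is hypothetical, and the meaning of the `*` suffix is not stated here, so the code treats it as opaque.

```python
from pathlib import Path


def list_predictions(model, results_dir="MELT/results"):
    """Return the prediction CSVs for `model`, matching {model}_pred_*.csv."""
    # glob matches against the whole file name, so "lr" does not pick up
    # files written for other models.
    return sorted(Path(results_dir).glob(f"{model}_pred_*.csv"))
```

The returned paths can be passed one at a time to `evaluate.py --csv` in the next step.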

Common flags:

| flag | default | applies to |
|---|---|---|
| --model | xgb | all |
| --epochs | 20 | DNN models |
| --batch_size | 256 | DNN models |
| --lr | 1e-3 | DNN models |
| --seed | 42 | all (Python random, numpy, torch, sklearn random_state, DataLoader shuffle) |
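
The `--seed` row lists several independent sources of randomness. As an illustration of what seeding each of them typically involves, here is a minimal sketch. It is not the code in `train.py`; the function name `seed_everything` is hypothetical, and numpy and torch are imported lazily so the snippet also runs where they are not installed.

```python
import random


def seed_everything(seed=42):
    """Seed the common RNGs and return a torch.Generator (or None)."""
    random.seed(seed)                      # Python's built-in RNG

    try:
        import numpy as np
        np.random.seed(seed)               # numpy's legacy global RNG
    except ImportError:
        pass

    generator = None
    try:
        import torch
        torch.manual_seed(seed)            # CPU (and CUDA) default generators
        torch.cuda.manual_seed_all(seed)   # safe to call without a GPU
        # Pass this as DataLoader(..., shuffle=True, generator=generator)
        # so the shuffle order is reproducible.
        generator = torch.Generator()
        generator.manual_seed(seed)
    except ImportError:
        pass

    # scikit-learn estimators take the seed directly, for example
    # RandomForestClassifier(random_state=seed).
    return generator
```

Seeding makes runs repeatable on the same machine and library versions. Some GPU operations can still be nondeterministic unless deterministic algorithms are explicitly enabled.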

train.py only reports AUPRC (threshold-free) and dumps per-run prediction probabilities to results/. Threshold-based metrics (precision / recall / F1) and ensembling are done in the next step.
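
AUPRC is threshold-free because it summarizes precision over every possible cutoff. One common estimator is average precision, the sum over thresholds of (recall gain) × (precision at that threshold). The sketch below implements that estimator in plain Python for illustration; whether `train.py` uses this exact estimator or a different one (for example a trapezoidal area) is an assumption, and the function name is hypothetical.

```python
def average_precision(y_true, y_score):
    """Average precision: sum of (R_k - R_{k-1}) * P_k over score thresholds."""
    total_pos = sum(y_true)
    if total_pos == 0:
        raise ValueError("average precision is undefined without positives")

    # Rank samples from highest to lowest predicted probability.
    pairs = sorted(zip(y_score, y_true), key=lambda p: -p[0])

    ap = 0.0
    tp = 0
    prev_recall = 0.0
    i = 0
    while i < len(pairs):
        # Samples tied at the same score enter the prediction set together.
        j = i
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            tp += pairs[j][1]
            j += 1
        precision = tp / j          # j samples are predicted positive so far
        recall = tp / total_pos
        ap += (recall - prev_recall) * precision
        prev_recall = recall
        i = j
    return ap
```

For labels `[0, 0, 1, 1]` and scores `[0.1, 0.4, 0.35, 0.8]`, the function returns about 0.83. A useful reference point is that a random classifier scores roughly the positive-class rate, so AUPRC values should be read against the class balance of the dataset.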

Step 2: Evaluate predictions

evaluate.py reads a prediction CSV from Step 1 and prints AUPRC plus a classification_report at one or more probability thresholds.

# single threshold
python evaluate.py --csv lgbm_pred_0.559999.csv --thresholds 0.5

# multi-threshold sweep
python evaluate.py --csv lgbm_pred_0.559999.csv --thresholds 0.3 0.4 0.5 0.6

| flag | default | role |
|---|---|---|
| --csv | | prediction CSV (relative paths resolve against results/) |
| --thresholds | [0.49] | one or more probability cutoffs; one classification_report per threshold |
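
To make the effect of a threshold sweep concrete, the sketch below computes precision, recall, and F1 for the positive class at a given cutoff. It is a plain-Python illustration, not the implementation in `evaluate.py`. The function name is hypothetical, and treating a probability equal to the threshold as positive (`>=`) is an assumption.

```python
def threshold_metrics(y_true, y_prob, threshold):
    """Precision, recall, and F1 for the positive class at one cutoff."""
    # Assumption: probabilities at or above the threshold count as positive.
    y_pred = [1 if p >= threshold else 0 for p in y_prob]

    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    # Report 0.0 rather than dividing by zero when a denominator is empty.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"threshold": threshold, "precision": precision,
            "recall": recall, "f1": f1}


def sweep(y_true, y_prob, thresholds):
    """Evaluate several cutoffs, mirroring a multi-threshold run."""
    return [threshold_metrics(y_true, y_prob, t) for t in thresholds]
```

Lowering the threshold generally raises recall at the cost of precision, which is why reporting several cutoffs side by side is more informative than a single one.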

src/

| file | role |
|---|---|
| feat_gen.py | Generates data/feat/feature.pkl from parsed transactions, bundle traces, and contextual info. |
| dataset.py | Data loading & preprocessing. Reads feature.pkl + label CSV, merges by mint_address, splits & scales. Exposes load_dataset(), TSDataset, ts_collate. |
| model.py | All model definitions and factories (sklearn baselines + MLP / TS deep models). |
| train.py | Training entry point. Argparse-driven; reports AUPRC and writes per-run prediction CSVs to results/. |
| evaluate.py | Evaluates prediction CSVs from train.py at one or more thresholds; supports weighted ensembling of multiple CSVs. |
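
The weighted ensembling mentioned for `evaluate.py` is not detailed here. As a rough illustration of the idea, the sketch below averages per-sample probabilities from several models using normalized weights. The function name `weighted_ensemble` is hypothetical, and the code assumes every probability list covers the same samples in the same order; how `evaluate.py` actually aligns and weights CSVs may differ.

```python
def weighted_ensemble(prob_lists, weights):
    """Weighted average of per-sample probabilities from several models."""
    if not prob_lists or len(prob_lists) != len(weights):
        raise ValueError("need one weight per list of probabilities")

    n = len(prob_lists[0])
    if any(len(p) != n for p in prob_lists):
        raise ValueError("all models must score the same number of samples")

    total = float(sum(weights))
    if total <= 0:
        raise ValueError("weights must sum to a positive value")

    # Dividing by the weight total keeps the result a valid probability.
    return [
        sum(w * probs[i] for w, probs in zip(weights, prob_lists)) / total
        for i in range(n)
    ]
```

With equal weights this reduces to a simple mean. The combined probabilities can then be thresholded in the same way as those from a single model.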

License

This project is released under the Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) license. See LICENSE for the full text.

You are free to use, share, and adapt the material for non-commercial purposes, provided you give appropriate credit (please cite our paper). Commercial use requires separate permission from the authors.
