geofetch is a command-line tool that downloads and organizes data and metadata from GEO and SRA. When given one or more GEO/SRA accessions, geofetch will:
- Download either raw or processed data from either SRA or GEO
- Produce a standardized PEP sample table. This makes it easy to run looper-compatible pipelines on public datasets, because geofetch handles data acquisition and metadata formatting and standardization for you.
- Prepare a project to run with sraconvert to convert SRA files into FASTQ files.
Key geofetch advantages:
- Works with GEO and SRA metadata
- Combines samples from different projects
- Standardizes output metadata
- Filters type and size of processed files (from GEO) before downloading them
- Easy to use
- Fast execution time
- Can search GEO to find relevant data
- Can be used either as a command-line tool or from within Python using an API
geofetch runs on the command line. This command will download the raw data and metadata for the given GSE number.
geofetch -i GSE95654
You can add --processed if you want to download processed files from the given experiment.
geofetch -i GSE95654 --processed
You can add --just-metadata if you want to download metadata without the raw SRA files or processed GEO files.
geofetch -i GSE95654 --just-metadata
geofetch -i GSE95654 --processed --just-metadata
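Which arguments to use depends on what you want to download: with no extra flags geofetch fetches raw data and metadata, --processed switches to processed files, and --just-metadata skips the data files and downloads metadata only.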
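The flag combinations above can also be assembled programmatically, for example when scripting several downloads. The following is a minimal sketch, not part of geofetch: `build_geofetch_command` is a hypothetical helper that only builds the argument list from the flags shown in this README (`-i`, `--processed`, `--just-metadata`) and prints it. It does not run geofetch.

```python
import shlex


def build_geofetch_command(accession, processed=False, just_metadata=False):
    """Return the geofetch argument list for one accession.

    Hypothetical helper for illustration; it only uses the flags
    documented above and does not execute anything.
    """
    cmd = ["geofetch", "-i", accession]
    if processed:
        cmd.append("--processed")
    if just_metadata:
        cmd.append("--just-metadata")
    return cmd


# Print the four flag combinations for the example accession.
for processed in (False, True):
    for just_metadata in (False, True):
        args = build_geofetch_command("GSE95654", processed, just_metadata)
        print(shlex.join(args))
```

To actually run a command, the returned list could be passed to `subprocess.run(args, check=True)`, assuming geofetch is installed and on your PATH.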
New features available in geofetch 0.11.0:
1) geofetch is now available as a Python API package. It can initialize peppy projects without downloading any SOFT files. Example:
from geofetch import Geofetcher

# initiate Geofetcher with all necessary arguments:
geof = Geofetcher(processed=True, acc_anno=True, discard_soft=True)

# get projects by providing as input GSE or file with GSEs
geof.get_projects("GSE160204")
2) To find GSEs and save them to a file, you can use Finder, the GSE finder tool:
from geofetch import Finder

# initiate Finder (use filters if necessary)
find_gse = Finder(filters='bed')

# get all projects that were found:
gse_list = find_gse.get_gse_all()
Find more information here: GSE Finder
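Once you have a list of accessions, saving it to a file can be sketched in plain Python. The sketch below makes two assumptions that the text above does not confirm: that the list holds accession strings, and that one accession per line is a suitable format for a "file with GSEs". `save_accessions` is a hypothetical helper, not a geofetch function, and the hard-coded list stands in for the Finder result.

```python
import tempfile
from pathlib import Path


def save_accessions(accessions, path):
    """Write accessions to `path`, one per line.

    Blank entries are skipped and duplicates are dropped while
    keeping the original order. Returns the list that was written.
    """
    cleaned = (a.strip() for a in accessions)
    unique = list(dict.fromkeys(a for a in cleaned if a))
    Path(path).write_text("\n".join(unique) + "\n")
    return unique


# Stand-in for a Finder result (assumed to be a list of strings).
gse_list = ["GSE95654", "GSE160204", "GSE95654"]

# Write into a temporary directory so the example leaves no files behind.
with tempfile.TemporaryDirectory() as tmp_dir:
    out_file = Path(tmp_dir) / "gse_list.txt"
    saved = save_accessions(gse_list, out_file)
    written = out_file.read_text()

print(saved)
```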