
Downloading the data


Commands for getting the data


At the moment the data are distributed from the Science Archive Server (SAS), using two main commands, rsync and wget. To browse the available data, you might want to start at the top level.


First, a disclaimer: the full data volume of the DR10 sample is tens of terabytes (see this table for specifics). rsync (and http) are not designed for such large data transfers. If you really do need the full data set, please contact the Help Desk to talk to a data transfer expert who can arrange a custom transfer. This will be faster for you and easier on our servers.

Now, below is a typical rsync command:

rsync -av rsync://[MJD]/ [MJD]/

This particular command downloads the data for a single MJD (Modified Julian Date) into a local directory of the same name.
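To fetch several MJDs in a row, the command above can be wrapped in a small function. This is a minimal sketch, not an official tool: `SAS_BASE` is a placeholder for the real rsync URL (which is not reproduced here), and `fetch_mjd` and `DRY_RUN` are hypothetical names introduced only for illustration.

```shell
# Placeholder: set SAS_BASE to the real rsync URL of the directory
# that holds the per-MJD subdirectories.
SAS_BASE="${SAS_BASE:-rsync://example.invalid/path/to/data}"

# fetch_mjd MJD: download one MJD's data into a local directory of the same name.
# With DRY_RUN=1 the rsync command is printed instead of executed.
fetch_mjd() {
    mjd="${1:-}"
    case "$mjd" in
        ''|*[!0-9]*)
            echo "fetch_mjd: MJD must be a number, got '$mjd'" >&2
            return 1 ;;
    esac
    ${DRY_RUN:+echo} rsync -av "$SAS_BASE/$mjd/" "$mjd/"
}
```

Calling `DRY_RUN=1 fetch_mjd 55000` (an arbitrary example MJD) shows the command that would be run without contacting the server.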

Getting the data: SEGUE-2

General Description

The data in the SAS (and the mirror, SAM) are "clean" samples of SEGUE-2 data. This definition of "clean" specifically covers two things:

There are a few exceptions to the latest-MJD-is-best rule. See "Notes on individual plates" below.

NOTE that there are "rerun" numbers, 104 for the spectra and 122 for the stellar parameters. Those rerun numbers are used to designate different versions of the pipeline software. If those get changed you will get many obtrusive announcements to tell you about it, but PLEASE do keep track. The rerun number for the spectra hasn't changed since March 2010 (rerun 104), and is the same pipeline that was run for DR7. Rerun 122 for the SSPP is new for DR10, and has been run on all the SEGUE-1, SEGUE-2 and SDSS legacy plates. This rerun of the SSPP has several improvements over the version used for SSPP rerun 116 (from DR8) and rerun 104 (which was used for DR7).
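One way to keep track of the rerun numbers is to store them next to the downloaded files. The sketch below is only a suggestion: the file name `RERUNS.txt` and the function `record_reruns` are hypothetical, and the values 104 and 122 are the ones quoted above for DR10.

```shell
# Rerun numbers quoted in the text for DR10.
SPECTRA_RERUN=104   # spectroscopic pipeline
SSPP_RERUN=122      # stellar parameter pipeline (SSPP)

# record_reruns DIR: write the rerun numbers and today's date to DIR/RERUNS.txt
# (hypothetical file name) so the provenance travels with the data.
record_reruns() {
    dir="${1:-.}"
    mkdir -p "$dir" || return 1
    {
        echo "spectra_rerun=$SPECTRA_RERUN"
        echo "sspp_rerun=$SSPP_RERUN"
        echo "downloaded=$(date +%Y-%m-%d)"
    } > "$dir/RERUNS.txt"
}
```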

The data model tells you what's what in each file. The easiest way in is probably the file index link on the data model top page.

From the plate subdirectories, the most useful files are the spZbest files (containing parameters measured from the spectra, e.g. redshift) and the spPlate files (containing the spectra themselves). For the SSPP outputs, the ssppOut files contain all the stellar parameters, the ssppOut-* files contain the measured line indices, and the ssppOut-*.ps.gz files have the condensed plots for the SSPP diagnostic outputs. There are also SSPP diagnostic plots for each object; instructions for getting them are given below.

You can get data from the Science Archive Mirror (SAM) by substituting the mirror's address for the SAS address in all the commands below.
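The substitution amounts to swapping one URL prefix for another. A minimal sketch of that idea follows; both URLs are placeholders (the real server addresses are not reproduced here), and `to_mirror` is a hypothetical helper name.

```shell
# Placeholders only: replace with the real SAS and SAM addresses.
SAS_URL="rsync://sas.example.invalid"
SAM_URL="rsync://sam.example.invalid"

# to_mirror URL: print URL with a leading SAS_URL replaced by SAM_URL.
# URLs that do not start with SAS_URL are printed unchanged.
to_mirror() {
    url="${1:-}"
    case "$url" in
        "$SAS_URL"*) printf '%s%s\n' "$SAM_URL" "${url#"$SAS_URL"}" ;;
        *)           printf '%s\n' "$url" ;;
    esac
}
```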


To get the spectra, you can use the following wget command. Be warned: it spends a long time (10 minutes, last time I checked) downloading index.html files before it gets to the actual data. It might also hang, although in recent (Feb. 2010) testing this particular command did not hang, while others did.

wget -N -nv -r -nH -np --cut-dirs=4 -A spPlate"*".fits,spZbest"*".fits \
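
If a transfer does hang, wget's own timeout and retry options may help. The following is a hedged sketch rather than a recipe tested against this server: `--timeout`, `--tries` and `--waitretry` are standard wget options, but the values chosen here, the function name `wget_spectra` and the `DRY_RUN` switch are assumptions, and the URL must be supplied by the caller.

```shell
# wget_spectra URL: same file selection as the command above, plus
# timeout/retry options so a stalled connection is abandoned and retried.
# With DRY_RUN=1 the wget command is printed instead of executed.
wget_spectra() {
    url="${1:-}"
    [ -n "$url" ] || { echo "usage: wget_spectra URL" >&2; return 1; }
    ${DRY_RUN:+echo} wget -N -nv -r -nH -np --cut-dirs=4 \
        --timeout=60 --tries=5 --waitretry=10 \
        -A 'spPlate*.fits,spZbest*.fits' "$url"
}
```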

You can also use rsync. Note it will say "receiving file list ..." for a long time while it sorts out the remote directory structure.

rsync -avzuL --include "*"/ --include "*"/spPlate"*" --include "*"/spZbest"*" --exclude "*" \
rsync:// ./
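
The order of the filter options matters: rsync acts on the first rule that matches, so the directory rule `*/` lets it descend into every subdirectory, the file rules pick out the wanted files, and the final `--exclude "*"` drops everything else. As an illustration, the hypothetical helper below (`rsync_filters` is not part of rsync) builds that option list from a set of file-name prefixes.

```shell
# rsync_filters PREFIX...: print include/exclude options that select files
# whose names start with any PREFIX, inside any subdirectory of the source.
rsync_filters() {
    printf '%s ' '--include=*/'                 # allow descending into directories
    for prefix in "$@"; do
        printf '%s ' "--include=*/${prefix}*"   # keep matching files
    done
    printf '%s\n' '--exclude=*'                 # drop everything else
}
```

With shell globbing disabled (`set -f`), the output can be spliced into a command in place of the hand-written options, for example `rsync -avzuL $(rsync_filters spPlate spZbest)` followed by the source URL and `./`.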

The data are stored in subdirectories by plate number, so this page has the spectroscopic parameter data and this page has the stellar parameters data for plate 3131.
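
Because each plate has its own subdirectory, a finished download can be checked plate by plate. The sketch below assumes a local layout of one directory per plate number directly under the download directory, with file names matching `spPlate*.fits` and `spZbest*.fits` as in the commands above; the function name `check_plates` is hypothetical.

```shell
# check_plates DIR: for every plate subdirectory of DIR, report whether it
# contains at least one spPlate and at least one spZbest FITS file.
check_plates() {
    top="${1:-.}"
    for dir in "$top"/*/; do
        [ -d "$dir" ] || continue            # DIR has no subdirectories
        plate=$(basename "$dir")
        missing=""
        for kind in spPlate spZbest; do
            # ls fails when the glob matches nothing
            ls "$dir$kind"*.fits >/dev/null 2>&1 || missing="$missing $kind"
        done
        if [ -z "$missing" ]; then
            echo "$plate: ok"
        else
            echo "$plate: missing$missing"
        fi
    done
}
```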

Target and Field (all objects) photometry, astrometry, proper motion, and target selection information

You can also get the photometry, proper motions and other target selection information here. See the data model link again for what is in these files.

To get all the photometric information for the spectroscopic targets:

wget -N -nv -r -nH -np  -A seguetsObjPlate"*" \


rsync -avzuL rsync:// ./