Disk Space

For users interested in downloading AbacusSummit data, this breakdown of the file sizes of the various data products may be helpful.

The following is an annotated summary of the output of du -BMB, so sizes are in units of 1,000,000 bytes (MB), the decimal convention that disk drives use. The numbers are for AbacusSummit_base_c000_ph008, which is typical, supplemented by AbacusSummit_base_c000_ph001 for information about the full time slices.

# The total outputs are 7.85 TB per simulation, if there are no full time slices.
7851475 AbacusSummit_base_c000_ph008

# When there are full time slices, they are about 3.5 TB *each*,
# with a mild trend in redshift, due to slight changes in compression efficiency.
# Sims have between 0 and 12 full time-slice epochs.
3578826   slices/z0.200
3534315   slices/z0.800
3498498   slices/z1.400
3472193   slices/z2.000
3451782   slices/z2.500
3431805   slices/z3.000
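
# As a rough budgeting aid, the numbers above can be combined into a per-simulation
# estimate.  This is only a sketch: it assumes that the non-slice outputs of a sim with
# full time slices are about the size of AbacusSummit_base_c000_ph008, and the names
# BASE_MB, FULL_SLICE_MB, and total_tb are made up for this example.

```python
# Sizes in units of 1,000,000 bytes (MB), copied from the listing above.
BASE_MB = 7851475  # AbacusSummit_base_c000_ph008, without full time slices
FULL_SLICE_MB = {
    "z0.200": 3578826,
    "z0.800": 3534315,
    "z1.400": 3498498,
    "z2.000": 3472193,
    "z2.500": 3451782,
    "z3.000": 3431805,
}


def total_tb(slices):
    """Estimated size in TB (10**12 bytes) of one sim plus the given full slices."""
    return (BASE_MB + sum(FULL_SLICE_MB[z] for z in slices)) / 1e6


print(f"no full slices:        {total_tb([]):.2f} TB")
print(f"z0.200 and z0.800:     {total_tb(['z0.200', 'z0.800']):.2f} TB")
print(f"all six listed epochs: {total_tb(FULL_SLICE_MB):.2f} TB")
```

# Sims with more than these six full epochs are not covered by this table; at about
# 3.5 TB each, every extra epoch adds roughly that much again.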

# Here's the breakdown; most of the data is in the halos.
# Most users won't need to look at the logs.
1       [top-level directory]
1       info
5649    log
1274121 lightcones
6571688 halos

# The lightcones contain 10% of the particles in rv+pid, and 100% in heal.
# rv: particle pos/vel
# pid: can be used to connect particles in the lightcone to halos in the timeslices;
#      also contains the kernel density estimate from the fully sampled density field.
# heal: The Nside=16384, ring=True healpix projection
# These are broken into files for each time step, typically d(ln(1+z)) ~ 0.002.
703830  lightcones/rv
329456  lightcones/heal
240836  lightcones/pid

# The halos come in two flavors: 12 primary time slices with more outputs,
# and then 21 secondary time slices with only limited outputs.
# Most users will only want the primary slices; the secondary ones
# are intended for merger trees and for matching halos onto the light cones.
#
# The primary slices are bigger, typically 400 GB each:
412661  halos/z0.100
412863  halos/z0.200
412748  halos/z0.300
412340  halos/z0.400
411654  halos/z0.500
408158  halos/z0.800
402861  halos/z1.100
396109  halos/z1.400
388176  halos/z1.700
379538  halos/z2.000
364212  halos/z2.500
348863  halos/z3.000
# Total of primary slices: 4750 GB

# The secondary slices are smaller, typically 100 GB and decreasing at high z:
111493  halos/z0.150
110879  halos/z0.250
110041  halos/z0.350
109006  halos/z0.450
107437  halos/z0.575
106324  halos/z0.650
105147  halos/z0.725
102498  halos/z0.875
101105  halos/z0.950
99575   halos/z1.025
96330   halos/z1.175
94619   halos/z1.250
92831   halos/z1.325
89070   halos/z1.475
87182   halos/z1.550
85311   halos/z1.625
79098   halos/z1.850
67995   halos/z2.250
53962   halos/z2.750
11267   halos/z5.000
351     halos/z8.000
# Total of secondary slices: 1822 GB
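
# The totals quoted above can be reproduced by summing the du lines.  A minimal sketch
# in Python, using a hypothetical helper parse_du that skips blank lines and '#'
# comments (the helper is an illustration, not part of any AbacusSummit tool):

```python
def parse_du(text):
    """Parse 'SIZE PATH' lines into {path: size}; skip blanks and '#' comments."""
    sizes = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        size, path = line.split(maxsplit=1)
        sizes[path] = int(size)
    return sizes


# Primary halo slices, copied from the listing above (units of 1,000,000 bytes).
PRIMARY = """
# The primary slices are bigger, typically 400 GB each:
412661  halos/z0.100
412863  halos/z0.200
412748  halos/z0.300
412340  halos/z0.400
411654  halos/z0.500
408158  halos/z0.800
402861  halos/z1.100
396109  halos/z1.400
388176  halos/z1.700
379538  halos/z2.000
364212  halos/z2.500
348863  halos/z3.000
"""

primary = parse_du(PRIMARY)
total_mb = sum(primary.values())
print(f"{len(primary)} primary slices, total {total_mb / 1000:.0f} GB")
```

# The same helper works on any other block of this page, since every data line has the
# size in the first whitespace-separated column.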

# One might opt to keep only certain types of files on disk, so here is a summary
# by file type.

# For example, a minimal installation might be only the halo_info and halo_rv_A files
# for the primary epochs, which total 1.1 TB, and perhaps only for some of those epochs.
# E.g., keeping five epochs might bring the storage down to 0.5 TB per simulation.

# Another example would be to omit the field B samples, which are provided only
# for the primary epochs.  This saves 1.83 TB per sim, leaving about 6 TB/sim.
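
# The minimal-installation estimate can be checked from the per-epoch halo_info and
# halo_rv_A sizes listed further down this page.  In this sketch the choice of five
# epochs (FIVE_EPOCHS) is hypothetical, picked only to illustrate the arithmetic.

```python
# Per-epoch sizes in MB for the primary slices, copied from the listings on this page.
HALO_INFO_MB = {
    "z0.100": 73188, "z0.200": 74301, "z0.300": 75190, "z0.400": 75864,
    "z0.500": 76323, "z0.800": 76416, "z1.100": 74722, "z1.400": 71488,
    "z1.700": 67014, "z2.000": 61615, "z2.500": 51419, "z3.000": 40797,
}
HALO_RV_A_MB = {
    "z0.100": 31415, "z0.200": 30859, "z0.300": 30213, "z0.400": 29498,
    "z0.500": 28733, "z0.800": 26264, "z1.100": 23699, "z1.400": 21216,
    "z1.700": 18767, "z2.000": 16457, "z2.500": 12965, "z3.000": 9965,
}


def minimal_install_tb(epochs):
    """Size in TB of halo_info + halo_rv_A for the given primary epochs."""
    return sum(HALO_INFO_MB[z] + HALO_RV_A_MB[z] for z in epochs) / 1e6


# Hypothetical selection of five primary epochs.
FIVE_EPOCHS = ["z0.200", "z0.500", "z0.800", "z1.100", "z1.400"]

print(f"all 12 primary epochs: {minimal_install_tb(HALO_INFO_MB):.2f} TB")
print(f"five chosen epochs:    {minimal_install_tb(FIVE_EPOCHS):.2f} TB")
```

# Low-redshift epochs are somewhat larger than high-redshift ones, so the saving
# depends on which epochs are kept.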

# The halo_info files contain the statistics for all halos, typically 70-75 GB/epoch.
# Note that the file format supports reading only subsets of columns; many users
# will need to load only a small fraction of these.
73188   halos/z0.100 halo_info
74301   halos/z0.200 halo_info
75190   halos/z0.300 halo_info
75864   halos/z0.400 halo_info
76323   halos/z0.500 halo_info
76416   halos/z0.800 halo_info
74722   halos/z1.100 halo_info
71488   halos/z1.400 halo_info
67014   halos/z1.700 halo_info
61615   halos/z2.000 halo_info
51419   halos/z2.500 halo_info
40797   halos/z3.000 halo_info
# Primary halo_info: 818 GB

73764   halos/z0.150 halo_info
74763   halos/z0.250 halo_info
75550   halos/z0.350 halo_info
76107   halos/z0.450 halo_info
76508   halos/z0.575 halo_info
76591   halos/z0.650 halo_info
76548   halos/z0.725 halo_info
76120   halos/z0.875 halo_info
75755   halos/z0.950 halo_info
75270   halos/z1.025 halo_info
74008   halos/z1.175 halo_info
73240   halos/z1.250 halo_info
72376   halos/z1.325 halo_info
70387   halos/z1.475 halo_info
69312   halos/z1.550 halo_info
68204   halos/z1.625 halo_info
64279   halos/z1.850 halo_info
56553   halos/z2.250 halo_info
45891   halos/z2.750 halo_info
10066   halos/z5.000 halo_info
303     halos/z8.000 halo_info
# Secondary halo_info: 1362 GB

# Next are the particles associated with these halos, in consistent subsamples of
# 3% (A) and 7% (B), all indexed from the halo_info files.
# First, we have the positions and velocities.
# Users painting HOD satellite galaxies into the halos could probably just use the A set.
31415   halos/z0.100 halo_rv_A
30859   halos/z0.200 halo_rv_A
30213   halos/z0.300 halo_rv_A
29498   halos/z0.400 halo_rv_A
28733   halos/z0.500 halo_rv_A
26264   halos/z0.800 halo_rv_A
23699   halos/z1.100 halo_rv_A
21216   halos/z1.400 halo_rv_A
18767   halos/z1.700 halo_rv_A
16457   halos/z2.000 halo_rv_A
12965   halos/z2.500 halo_rv_A
9965    halos/z3.000 halo_rv_A
# Primary 3% halo rv: 280 GB

70448   halos/z0.100 halo_rv_B
69199   halos/z0.200 halo_rv_B
67735   halos/z0.300 halo_rv_B
66123   halos/z0.400 halo_rv_B
64399   halos/z0.500 halo_rv_B
58825   halos/z0.800 halo_rv_B
53036   halos/z1.100 halo_rv_B
47321   halos/z1.400 halo_rv_B
41824   halos/z1.700 halo_rv_B
36626   halos/z2.000 halo_rv_B
28821   halos/z2.500 halo_rv_B
22132   halos/z3.000 halo_rv_B
# Primary 7% halo rv: 626 GB

# And then we have the PIDs, with kernel densities embedded.
# These are used to build merger trees, but could also be used
# to track particular particles as part of galaxy assignment,
# e.g., to find the densest particle in a progenitor halo and
# use its late-time position.
13680   halos/z0.100 halo_pid_A
13325   halos/z0.200 halo_pid_A
12947   halos/z0.300 halo_pid_A
12548   halos/z0.400 halo_pid_A
12140   halos/z0.500 halo_pid_A
10870   halos/z0.800 halo_pid_A
9652    halos/z1.100 halo_pid_A
8493    halos/z1.400 halo_pid_A
7421    halos/z1.700 halo_pid_A
6442    halos/z2.000 halo_pid_A
4989    halos/z2.500 halo_pid_A
3777    halos/z3.000 halo_pid_A
# Primary 3% halo pid: 116 GB

30217   halos/z0.100 halo_pid_B
29390   halos/z0.200 halo_pid_B
28501   halos/z0.300 halo_pid_B
27584   halos/z0.400 halo_pid_B
26637   halos/z0.500 halo_pid_B
23780   halos/z0.800 halo_pid_B
21020   halos/z1.100 halo_pid_B
18439   halos/z1.400 halo_pid_B
16061   halos/z1.700 halo_pid_B
13899   halos/z2.000 halo_pid_B
10731   halos/z2.500 halo_pid_B
8097    halos/z3.000 halo_pid_B
# Primary 7% halo pid: 254 GB

# For the secondary epochs, the only particle files we provide are the PID+density files.
# These are slightly smaller because they include only L1 particles, not L0 particles.
11726   halos/z0.150 halo_pid_A
11238   halos/z0.250 halo_pid_A
10743   halos/z0.350 halo_pid_A
10256   halos/z0.450 halo_pid_A
9655    halos/z0.575 halo_pid_A
9292    halos/z0.650 halo_pid_A
8940    halos/z0.725 halo_pid_A
8258    halos/z0.875 halo_pid_A
7941    halos/z0.950 halo_pid_A
7620    halos/z1.025 halo_pid_A
7007    halos/z1.175 halo_pid_A
6716    halos/z1.250 halo_pid_A
6429    halos/z1.325 halo_pid_A
5883    halos/z1.475 halo_pid_A
5628    halos/z1.550 halo_pid_A
5389    halos/z1.625 halo_pid_A
4680    halos/z1.850 halo_pid_A
3621    halos/z2.250 halo_pid_A
2563    halos/z2.750 halo_pid_A
395     halos/z5.000 halo_pid_A
16      halos/z8.000 halo_pid_A
# Secondary 3% halo pid: 144 GB

26003   halos/z0.150 halo_pid_B
24879   halos/z0.250 halo_pid_B
23749   halos/z0.350 halo_pid_B
22643   halos/z0.450 halo_pid_B
21274   halos/z0.575 halo_pid_B
20442   halos/z0.650 halo_pid_B
19659   halos/z0.725 halo_pid_B
18120   halos/z0.875 halo_pid_B
17410   halos/z0.950 halo_pid_B
16685   halos/z1.025 halo_pid_B
15316   halos/z1.175 halo_pid_B
14664   halos/z1.250 halo_pid_B
14026   halos/z1.325 halo_pid_B
12802   halos/z1.475 halo_pid_B
12243   halos/z1.550 halo_pid_B
11719   halos/z1.625 halo_pid_B
10139   halos/z1.850 halo_pid_B
7822    halos/z2.250 halo_pid_B
5509    halos/z2.750 halo_pid_B
806     halos/z5.000 halo_pid_B
34      halos/z8.000 halo_pid_B
# Secondary 7% halo pid: 316 GB

# We also provide the rest of the density field, i.e., the complement of the halo set,
# in the subsamples.  These would be used for matter-field statistics, to associate
# particles in the periphery of halos, or to run a different group finder
# (admittedly on only 10% of the dynamical particles).
43432   halos/z0.100 field_rv_A
44044   halos/z0.200 field_rv_A
44724   halos/z0.300 field_rv_A
45445   halos/z0.400 field_rv_A
46195   halos/z0.500 field_rv_A
48542   halos/z0.800 field_rv_A
50898   halos/z1.100 field_rv_A
53164   halos/z1.400 field_rv_A
55295   halos/z1.700 field_rv_A
57277   halos/z2.000 field_rv_A
60189   halos/z2.500 field_rv_A
62610   halos/z3.000 field_rv_A
# Primary 3% field rv: 612 GB

97497   halos/z0.100 field_rv_B
98866   halos/z0.200 field_rv_B
100381  halos/z0.300 field_rv_B
101991  halos/z0.400 field_rv_B
103666  halos/z0.500 field_rv_B
108916  halos/z0.800 field_rv_B
114209  halos/z1.100 field_rv_B
119330  halos/z1.400 field_rv_B
124160  halos/z1.700 field_rv_B
128634  halos/z2.000 field_rv_B
135233  halos/z2.500 field_rv_B
140699  halos/z3.000 field_rv_B
# Primary 7% field rv: 1374 GB

# Field PIDs are probably not used much, but these do relate particles across epochs
# and the PID encodes the initial grid location for Lagrangian displacements.
16508   halos/z0.100 field_pid_A
16556   halos/z0.200 field_pid_A
16627   halos/z0.300 field_pid_A
16709   halos/z0.400 field_pid_A
16800   halos/z0.500 field_pid_A
17124   halos/z0.800 field_pid_A
17468   halos/z1.100 field_pid_A
17809   halos/z1.400 field_pid_A
18128   halos/z1.700 field_pid_A
18443   halos/z2.000 field_pid_A
18833   halos/z2.500 field_pid_A
19093   halos/z3.000 field_pid_A
# Primary 3% field pid: 210 GB

36280   halos/z0.100 field_pid_B
36327   halos/z0.200 field_pid_B
36434   halos/z0.300 field_pid_B
36582   halos/z0.400 field_pid_B
36765   halos/z0.500 field_pid_B
37426   halos/z0.800 field_pid_B
38160   halos/z1.100 field_pid_B
38855   halos/z1.400 field_pid_B
39510   halos/z1.700 field_pid_B
40150   halos/z2.000 field_pid_B
41037   halos/z2.500 field_pid_B
41698   halos/z3.000 field_pid_B
# Primary 7% field pid: 459 GB
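
# With all the per-type totals in hand, the footprint of a chosen set of file types
# can be tallied.  A minimal sketch using the rounded GB totals quoted in the comments
# above; the dictionary names and the footprint_gb helper are made up for this example.

```python
# Rounded totals in GB, copied from the "# Primary ..." and "# Secondary ..." lines.
PRIMARY_GB = {
    "halo_info": 818,
    "halo_rv_A": 280, "halo_rv_B": 626,
    "halo_pid_A": 116, "halo_pid_B": 254,
    "field_rv_A": 612, "field_rv_B": 1374,
    "field_pid_A": 210, "field_pid_B": 459,
}
SECONDARY_GB = {"halo_info": 1362, "halo_pid_A": 144, "halo_pid_B": 316}


def footprint_gb(primary_types, secondary_types=()):
    """Total GB for the selected file types in primary and secondary epochs."""
    return (sum(PRIMARY_GB[t] for t in primary_types)
            + sum(SECONDARY_GB[t] for t in secondary_types))


everything = footprint_gb(PRIMARY_GB, SECONDARY_GB)
no_field_b = footprint_gb(
    [t for t in PRIMARY_GB if t not in ("field_rv_B", "field_pid_B")],
    SECONDARY_GB,
)
a_only = footprint_gb([t for t in PRIMARY_GB if not t.endswith("_B")])

print(f"all halo-directory files:  {everything} GB")
print(f"without field B samples:   {no_field_b} GB")
print(f"primary epochs, A only:    {a_only} GB")
```

# The per-type totals add up to the 4750 GB and 1822 GB slice totals to within
# rounding, which is a useful check that no file type has been missed.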

# The full time slices are split into L0 and field (non-L0) sets.
# However, this split was only a convenience in the code; the L0 particles
# are not indexed in halo_info, so only the concatenation of the two sets is useful.
# The L0 fraction increases toward low redshift.
#
# The position+velocity data is in the pack9 format, which gives
# somewhat higher precision than RVint.  These files average about
# 2.8 TB per epoch, which is 8.5 bytes per particle.
1682068   slices/z0.200/field_pack9
1083440   slices/z0.200/L0_pack9

1864771   slices/z0.800/field_pack9
925909    slices/z0.800/L0_pack9

2061023   slices/z1.400/field_pack9
748046    slices/z1.400/L0_pack9

2242314   slices/z2.000/field_pack9
581440    slices/z2.000/L0_pack9

2371831   slices/z2.500/field_pack9
458272    slices/z2.500/L0_pack9

2480799   slices/z3.000/field_pack9
352132    slices/z3.000/L0_pack9
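
# The per-epoch average and the bytes-per-particle figure can be checked from the sizes
# above.  This sketch ASSUMES a particle count of 6912**3 for the base simulations, which
# is not stated on this page.  The L0 fraction printed here is a fraction of bytes on
# disk, which need not equal the fraction of particles.

```python
# (field_pack9, L0_pack9) sizes in MB, copied from the listing above, keyed by redshift.
PACK9_MB = {
    0.2: (1682068, 1083440),
    0.8: (1864771, 925909),
    1.4: (2061023, 748046),
    2.0: (2242314, 581440),
    2.5: (2371831, 458272),
    3.0: (2480799, 352132),
}

# ASSUMPTION: number of particles in a base simulation (not given on this page).
N_PARTICLES = 6912**3

totals_mb = {z: field + l0 for z, (field, l0) in PACK9_MB.items()}
mean_mb = sum(totals_mb.values()) / len(totals_mb)
bytes_per_particle = mean_mb * 1e6 / N_PARTICLES
l0_fraction = {z: l0 / (field + l0) for z, (field, l0) in PACK9_MB.items()}

print(f"mean pack9 size: {mean_mb / 1e6:.2f} TB per epoch")
print(f"bytes per particle: {bytes_per_particle:.1f}")
for z in sorted(l0_fraction):
    print(f"z={z:.1f}: L0 is {l0_fraction[z]:.0%} of the pack9 bytes")
```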

# The PID and kernel density estimate are in the pid files.
# These average about 0.7 TB per epoch, increasing toward low redshift.
510862    slices/z3.000/field_pack9_pid
88014     slices/z3.000/L0_pack9_pid

501868    slices/z2.500/field_pack9_pid
119812    slices/z2.500/L0_pack9_pid

489859    slices/z2.000/field_pack9_pid
158581    slices/z2.000/L0_pack9_pid

472529    slices/z1.400/field_pack9_pid
216902    slices/z1.400/L0_pack9_pid

454954    slices/z0.800/field_pack9_pid
288682    slices/z0.800/L0_pack9_pid

444629    slices/z0.200/field_pack9_pid
368691    slices/z0.200/L0_pack9_pid
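
# The 0.7 TB average and the trend toward low redshift follow from summing the field
# and L0 pid files per epoch.  A short sketch; the variable names are illustrative only.

```python
# (field_pack9_pid, L0_pack9_pid) sizes in MB, copied from the listing above.
PID_MB = {
    3.0: (510862, 88014),
    2.5: (501868, 119812),
    2.0: (489859, 158581),
    1.4: (472529, 216902),
    0.8: (454954, 288682),
    0.2: (444629, 368691),
}

pid_totals_mb = {z: field + l0 for z, (field, l0) in PID_MB.items()}
pid_mean_tb = sum(pid_totals_mb.values()) / len(pid_totals_mb) / 1e6

for z in sorted(pid_totals_mb):
    print(f"z={z:.1f}: {pid_totals_mb[z] / 1e6:.2f} TB")
print(f"mean: {pid_mean_tb:.2f} TB per epoch")
```

# Adding each epoch's pid total to its pack9 total recovers the roughly 3.5 TB per full
# time slice quoted at the top of this page.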