Disk Space

For users interested in downloading AbacusSummit data, this breakdown of the file sizes of the various data products may be helpful.

The following is an annotated summary of du -BMB output (units of MB, i.e., 1,000,000 bytes, as disk drives use). The listing is for AbacusSummit_base_c000_ph008, which is typical, supplemented by AbacusSummit_base_c000_ph001 for information about the full time slices.

# The total outputs are 7.85 TB per simulation, if there are no full time slices.
7851475 AbacusSummit_base_c000_ph008

# When there are full time slices, they are about 3.5 TB *each*,
# with a mild trend in redshift, due to slight changes in compression efficiency.
# Sims have between 0 and 12 full time-slice epochs.
3578826   slices/z0.200
3534315   slices/z0.800
3498498   slices/z1.400
3472193   slices/z2.000
3451782   slices/z2.500
3431805   slices/z3.000
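
As a rough planning aid, the two figures above (about 7.85 TB without full time slices, about 3.5 TB per full time slice) can be combined into a per-simulation estimate. The sketch below is illustrative only: the function name is hypothetical, and it assumes that every full time slice costs the mean of the six sizes listed above, even though the base total comes from ph008 and the slice sizes from ph001.

```python
# Sizes in MB (10^6 bytes), copied from the du listing above.
BASE_MB = 7_851_475  # AbacusSummit_base_c000_ph008, no full time slices
FULL_SLICE_MB = {
    "z0.200": 3_578_826,
    "z0.800": 3_534_315,
    "z1.400": 3_498_498,
    "z2.000": 3_472_193,
    "z2.500": 3_451_782,
    "z3.000": 3_431_805,
}
MAX_SLICES = 12  # sims have between 0 and 12 full time-slice epochs


def estimated_total_tb(n_slices):
    """Rough footprint in TB for a sim with n_slices full time slices.

    Assumption: each slice costs the mean of the six listed slice sizes.
    """
    if not 0 <= n_slices <= MAX_SLICES:
        raise ValueError("n_slices must be between 0 and 12")
    mean_slice_mb = sum(FULL_SLICE_MB.values()) / len(FULL_SLICE_MB)
    return (BASE_MB + n_slices * mean_slice_mb) / 1_000_000


for n in (0, 6, 12):
    print(f"{n:2d} full slices: {estimated_total_tb(n):5.1f} TB")
```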

# Here's the breakdown; most of the data is in the halos.
# Most users won't need to look at the logs.
1       [top-level directory]
1       info
5649    log
1274121 lightcones
6571688 halos

# The lightcones contain 10% of the particles in rv+pid, and 100% in heal.
# rv: particle pos/vel
# pid: can be used to connect particles in the lightcone to halos in the timeslices;
#      also contains the kernel density estimate from the fully sampled density field.
# heal: The Nside=16384, ring=True healpix projection
# These are broken into files for each time step, typically d(ln(1+z)) ~ 0.002.
703830  lightcones/rv
329456  lightcones/heal
240836  lightcones/pid

# The halos come in two flavors: 12 primary time slices with more outputs,
# and then 21 secondary time slices with only limited outputs.
# Most users will only want the primary slices; the secondary ones
# are intended for merger trees and for matching halos onto the light cones.
#
# The primary slices are bigger, typically 400 GB each:
412661  halos/z0.100
412863  halos/z0.200
412748  halos/z0.300
412340  halos/z0.400
411654  halos/z0.500
408158  halos/z0.800
402861  halos/z1.100
396109  halos/z1.400
388176  halos/z1.700
379538  halos/z2.000
364212  halos/z2.500
348863  halos/z3.000
# Total of primary slices: 4750 GB

# The secondary slices are smaller, typically 100 GB and decreasing at high z:
111493  halos/z0.150
110879  halos/z0.250
110041  halos/z0.350
109006  halos/z0.450
107437  halos/z0.575
106324  halos/z0.650
105147  halos/z0.725
102498  halos/z0.875
101105  halos/z0.950
99575   halos/z1.025
96330   halos/z1.175
94619   halos/z1.250
92831   halos/z1.325
89070   halos/z1.475
87182   halos/z1.550
85311   halos/z1.625
79098   halos/z1.850
67995   halos/z2.250
53962   halos/z2.750
11267   halos/z5.000
351     halos/z8.000
# Total of secondary slices: 1822 GB

# But one might opt only to keep certain types of files on disk, so here is the summary
# of the types.

# For example, a minimal installation might be only the halo_info and halo_rv_A files,
# which are 1.1 TB, and perhaps only for some of the primary epochs.  E.g., five
# epochs might knock the storage down to 0.5 TB per simulation.

# Another example would be to install only the primary epochs, without the
# field B samples.  This saves 1.83 TB per sim (1374 GB of field_rv_B plus
# 459 GB of field_pid_B), so one is at about 6 TB/sim.
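
The arithmetic behind these two scenarios can be reproduced from the per-product primary subtotals quoted further down this listing. The following is a minimal sketch, not an official tool: the dictionary layout and the function name are hypothetical, and scaling linearly with the number of epochs kept is an assumption, since the epochs are not all the same size.

```python
# Primary-epoch subtotals in GB, copied from the "# Primary ..." lines
# further down this listing.
PRIMARY_GB = {
    "halo_info": 818,
    "halo_rv_A": 280,
    "halo_rv_B": 626,
    "halo_pid_A": 116,
    "halo_pid_B": 254,
    "field_rv_A": 612,
    "field_rv_B": 1374,
    "field_pid_A": 210,
    "field_pid_B": 459,
}
N_PRIMARY = 12   # number of primary epochs
TOTAL_GB = 7851  # whole simulation without full time slices


def footprint_gb(products, n_epochs=N_PRIMARY):
    """Approximate size of the chosen primary products.

    Assumption: size scales linearly with the number of epochs kept.
    """
    subtotal = sum(PRIMARY_GB[p] for p in products)
    return subtotal * n_epochs / N_PRIMARY


minimal = footprint_gb(["halo_info", "halo_rv_A"])
minimal_five = footprint_gb(["halo_info", "halo_rv_A"], n_epochs=5)
field_b = footprint_gb(["field_rv_B", "field_pid_B"])

print(f"halo_info + halo_rv_A, 12 epochs: {minimal / 1000:.2f} TB")
print(f"halo_info + halo_rv_A,  5 epochs: {minimal_five / 1000:.2f} TB")
print(f"saved by dropping field B:        {field_b / 1000:.2f} TB")
print(f"remaining per sim:                {(TOTAL_GB - field_b) / 1000:.2f} TB")
```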

# The halo info files contain the stats about all halos, typically 70-75 GB/epoch.
# Note that the file format supports reading only subsets of columns; many users
# will need to load only a small fraction of these.
73188   halos/z0.100 halo_info
74301   halos/z0.200 halo_info
75190   halos/z0.300 halo_info
75864   halos/z0.400 halo_info
76323   halos/z0.500 halo_info
76416   halos/z0.800 halo_info
74722   halos/z1.100 halo_info
71488   halos/z1.400 halo_info
67014   halos/z1.700 halo_info
61615   halos/z2.000 halo_info
51419   halos/z2.500 halo_info
40797   halos/z3.000 halo_info
# Primary halo_info: 818 GB

73764   halos/z0.150 halo_info
74763   halos/z0.250 halo_info
75550   halos/z0.350 halo_info
76107   halos/z0.450 halo_info
76508   halos/z0.575 halo_info
76591   halos/z0.650 halo_info
76548   halos/z0.725 halo_info
76120   halos/z0.875 halo_info
75755   halos/z0.950 halo_info
75270   halos/z1.025 halo_info
74008   halos/z1.175 halo_info
73240   halos/z1.250 halo_info
72376   halos/z1.325 halo_info
70387   halos/z1.475 halo_info
69312   halos/z1.550 halo_info
68204   halos/z1.625 halo_info
64279   halos/z1.850 halo_info
56553   halos/z2.250 halo_info
45891   halos/z2.750 halo_info
10066   halos/z5.000 halo_info
303     halos/z8.000 halo_info
# Secondary halo_info: 1362 GB

# The particles associated with these halos are stored as consistent subsamples,
# 3% in A and 7% in B, all indexed from the halo_info files.
# First, we have the positions and velocities.
# Users painting HOD satellite galaxies into the halos could probably just use the A set.
31415   halos/z0.100 halo_rv_A
30859   halos/z0.200 halo_rv_A
30213   halos/z0.300 halo_rv_A
29498   halos/z0.400 halo_rv_A
28733   halos/z0.500 halo_rv_A
26264   halos/z0.800 halo_rv_A
23699   halos/z1.100 halo_rv_A
21216   halos/z1.400 halo_rv_A
18767   halos/z1.700 halo_rv_A
16457   halos/z2.000 halo_rv_A
12965   halos/z2.500 halo_rv_A
9965    halos/z3.000 halo_rv_A
# Primary 3% halo rv: 280 GB

70448   halos/z0.100 halo_rv_B
69199   halos/z0.200 halo_rv_B
67735   halos/z0.300 halo_rv_B
66123   halos/z0.400 halo_rv_B
64399   halos/z0.500 halo_rv_B
58825   halos/z0.800 halo_rv_B
53036   halos/z1.100 halo_rv_B
47321   halos/z1.400 halo_rv_B
41824   halos/z1.700 halo_rv_B
36626   halos/z2.000 halo_rv_B
28821   halos/z2.500 halo_rv_B
22132   halos/z3.000 halo_rv_B
# Primary 7% halo rv: 626 GB

# And then we have the PIDs, with kernel densities embedded.
# These are used to build merger trees, but could also be used
# to track particular particles as part of galaxy assignment,
# e.g., to find the densest particle in a progenitor halo and
# use its late-time position.
13680   halos/z0.100 halo_pid_A
13325   halos/z0.200 halo_pid_A
12947   halos/z0.300 halo_pid_A
12548   halos/z0.400 halo_pid_A
12140   halos/z0.500 halo_pid_A
10870   halos/z0.800 halo_pid_A
9652    halos/z1.100 halo_pid_A
8493    halos/z1.400 halo_pid_A
7421    halos/z1.700 halo_pid_A
6442    halos/z2.000 halo_pid_A
4989    halos/z2.500 halo_pid_A
3777    halos/z3.000 halo_pid_A
# Primary 3% halo pid: 116 GB

30217   halos/z0.100 halo_pid_B
29390   halos/z0.200 halo_pid_B
28501   halos/z0.300 halo_pid_B
27584   halos/z0.400 halo_pid_B
26637   halos/z0.500 halo_pid_B
23780   halos/z0.800 halo_pid_B
21020   halos/z1.100 halo_pid_B
18439   halos/z1.400 halo_pid_B
16061   halos/z1.700 halo_pid_B
13899   halos/z2.000 halo_pid_B
10731   halos/z2.500 halo_pid_B
8097    halos/z3.000 halo_pid_B
# Primary 7% halo pid: 254 GB

# For the secondary epochs, we provide only the PID+density file.
# These are slightly smaller because they include only L1 particles, not L0 particles.
11726   halos/z0.150 halo_pid_A
11238   halos/z0.250 halo_pid_A
10743   halos/z0.350 halo_pid_A
10256   halos/z0.450 halo_pid_A
9655    halos/z0.575 halo_pid_A
9292    halos/z0.650 halo_pid_A
8940    halos/z0.725 halo_pid_A
8258    halos/z0.875 halo_pid_A
7941    halos/z0.950 halo_pid_A
7620    halos/z1.025 halo_pid_A
7007    halos/z1.175 halo_pid_A
6716    halos/z1.250 halo_pid_A
6429    halos/z1.325 halo_pid_A
5883    halos/z1.475 halo_pid_A
5628    halos/z1.550 halo_pid_A
5389    halos/z1.625 halo_pid_A
4680    halos/z1.850 halo_pid_A
3621    halos/z2.250 halo_pid_A
2563    halos/z2.750 halo_pid_A
395     halos/z5.000 halo_pid_A
16      halos/z8.000 halo_pid_A
# Secondary 3% halo pid: 144 GB

26003   halos/z0.150 halo_pid_B
24879   halos/z0.250 halo_pid_B
23749   halos/z0.350 halo_pid_B
22643   halos/z0.450 halo_pid_B
21274   halos/z0.575 halo_pid_B
20442   halos/z0.650 halo_pid_B
19659   halos/z0.725 halo_pid_B
18120   halos/z0.875 halo_pid_B
17410   halos/z0.950 halo_pid_B
16685   halos/z1.025 halo_pid_B
15316   halos/z1.175 halo_pid_B
14664   halos/z1.250 halo_pid_B
14026   halos/z1.325 halo_pid_B
12802   halos/z1.475 halo_pid_B
12243   halos/z1.550 halo_pid_B
11719   halos/z1.625 halo_pid_B
10139   halos/z1.850 halo_pid_B
7822    halos/z2.250 halo_pid_B
5509    halos/z2.750 halo_pid_B
806     halos/z5.000 halo_pid_B
34      halos/z8.000 halo_pid_B
# Secondary 7% halo pid: 316 GB

# We provide the rest of the density field, i.e., the complement of the halo set,
# in the subsamples.  These would be used for matter-field statistics, to associate
# particles in the periphery of halos, or to run a different group finder
# (admittedly on only 10% of the dynamical particles).
43432   halos/z0.100 field_rv_A
44044   halos/z0.200 field_rv_A
44724   halos/z0.300 field_rv_A
45445   halos/z0.400 field_rv_A
46195   halos/z0.500 field_rv_A
48542   halos/z0.800 field_rv_A
50898   halos/z1.100 field_rv_A
53164   halos/z1.400 field_rv_A
55295   halos/z1.700 field_rv_A
57277   halos/z2.000 field_rv_A
60189   halos/z2.500 field_rv_A
62610   halos/z3.000 field_rv_A
# Primary 3% field rv: 612 GB

97497   halos/z0.100 field_rv_B
98866   halos/z0.200 field_rv_B
100381  halos/z0.300 field_rv_B
101991  halos/z0.400 field_rv_B
103666  halos/z0.500 field_rv_B
108916  halos/z0.800 field_rv_B
114209  halos/z1.100 field_rv_B
119330  halos/z1.400 field_rv_B
124160  halos/z1.700 field_rv_B
128634  halos/z2.000 field_rv_B
135233  halos/z2.500 field_rv_B
140699  halos/z3.000 field_rv_B
# Primary 7% field rv: 1374 GB

# Field PIDs are probably not used much, but these do relate particles across epochs
# and the PID encodes the initial grid location for Lagrangian displacements.
16508   halos/z0.100 field_pid_A
16556   halos/z0.200 field_pid_A
16627   halos/z0.300 field_pid_A
16709   halos/z0.400 field_pid_A
16800   halos/z0.500 field_pid_A
17124   halos/z0.800 field_pid_A
17468   halos/z1.100 field_pid_A
17809   halos/z1.400 field_pid_A
18128   halos/z1.700 field_pid_A
18443   halos/z2.000 field_pid_A
18833   halos/z2.500 field_pid_A
19093   halos/z3.000 field_pid_A
# Primary 3% field pid: 210 GB

36280   halos/z0.100 field_pid_B
36327   halos/z0.200 field_pid_B
36434   halos/z0.300 field_pid_B
36582   halos/z0.400 field_pid_B
36765   halos/z0.500 field_pid_B
37426   halos/z0.800 field_pid_B
38160   halos/z1.100 field_pid_B
38855   halos/z1.400 field_pid_B
39510   halos/z1.700 field_pid_B
40150   halos/z2.000 field_pid_B
41037   halos/z2.500 field_pid_B
41698   halos/z3.000 field_pid_B
# Primary 7% field pid: 459 GB

# The full time slices are split into L0 and field (non-L0) sets.
# However, this was just due to convenience in the code; the L0 particles
# are not indexed in halo_info.  Only concatenations will be useful.
# The ratio of L0 to field particles increases toward low redshift.
#
# The position+velocity data is in the pack9 format, which gives
# somewhat higher precision than RVint.  These files average about
# 2.8 TB per epoch, which is 8.5 bytes per particle.
1682068   slices/z0.200/field_pack9
1083440   slices/z0.200/L0_pack9

1864771   slices/z0.800/field_pack9
925909    slices/z0.800/L0_pack9

2061023   slices/z1.400/field_pack9
748046    slices/z1.400/L0_pack9

2242314   slices/z2.000/field_pack9
581440    slices/z2.000/L0_pack9

2371831   slices/z2.500/field_pack9
458272    slices/z2.500/L0_pack9

2480799   slices/z3.000/field_pack9
352132    slices/z3.000/L0_pack9
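
The quoted averages can be checked from the sizes above. This sketch assumes a particle count of 6912^3, the base-resolution value for AbacusSummit; that number is not stated on this page, so treat it as an assumption when applying the same check to other simulations.

```python
# pack9 sizes in MB (10^6 bytes), copied from the listing above,
# stored as (field_pack9, L0_pack9) per full time slice.
PACK9_MB = {
    "z0.200": (1_682_068, 1_083_440),
    "z0.800": (1_864_771, 925_909),
    "z1.400": (2_061_023, 748_046),
    "z2.000": (2_242_314, 581_440),
    "z2.500": (2_371_831, 458_272),
    "z3.000": (2_480_799, 352_132),
}

# Assumption: base-resolution particle count (not stated on this page).
N_PARTICLES = 6912**3

# Only the concatenation of field and L0 is meaningful, so add them.
totals_mb = {z: field + l0 for z, (field, l0) in PACK9_MB.items()}

for z, mb in totals_mb.items():
    n_bytes = mb * 1_000_000
    print(f"{z}: {n_bytes / 1e12:.2f} TB, {n_bytes / N_PARTICLES:.2f} bytes/particle")

mean_mb = sum(totals_mb.values()) / len(totals_mb)
mean_tb = mean_mb / 1_000_000
mean_bytes_per_particle = mean_mb * 1_000_000 / N_PARTICLES
print(f"mean: {mean_tb:.2f} TB, {mean_bytes_per_particle:.2f} bytes/particle")
```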

# The PID and kernel density estimate are in the pid files.
# These average about 0.7 TB per epoch, increasing toward low redshift.
510862    slices/z3.000/field_pack9_pid
88014     slices/z3.000/L0_pack9_pid

501868    slices/z2.500/field_pack9_pid
119812    slices/z2.500/L0_pack9_pid

489859    slices/z2.000/field_pack9_pid
158581    slices/z2.000/L0_pack9_pid

472529    slices/z1.400/field_pack9_pid
216902    slices/z1.400/L0_pack9_pid

454954    slices/z0.800/field_pack9_pid
288682    slices/z0.800/L0_pack9_pid

444629    slices/z0.200/field_pack9_pid
368691    slices/z0.200/L0_pack9_pid
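
To recompute subtotals like the "# Primary ..." lines for a different selection of epochs, the three-column entries in this listing (size in MB, epoch directory, product name) can be summed with a few lines of Python. This is a minimal sketch: the function name and the sample text are hypothetical, and it assumes the annotated format shown on this page rather than raw du output.

```python
from collections import defaultdict


def sum_by_product(listing):
    """Sum sizes (MB) per product from 'SIZE  DIR  PRODUCT' lines.

    Comment lines, blank lines, and two-column directory totals are skipped.
    """
    totals = defaultdict(int)
    for line in listing.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split()
        if len(parts) != 3 or not parts[0].isdigit():
            continue
        size_mb, _directory, product = parts
        totals[product] += int(size_mb)
    return dict(totals)


# A small excerpt in the same format as the listing above.
SAMPLE = """\
# comments are skipped
31415   halos/z0.100 halo_rv_A
30859   halos/z0.200 halo_rv_A
13680   halos/z0.100 halo_pid_A
412661  halos/z0.100
"""

totals = sum_by_product(SAMPLE)
for product, mb in sorted(totals.items()):
    print(f"{product}: {mb / 1000:.1f} GB")
```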