Disk Space
For users interested in downloading AbacusSummit data, this breakdown of the file sizes of the various data products may be helpful.
The following is an annotated summary of the output of du -BMB
(so sizes are in units of 1,000,000 bytes, the convention disk drives use).
This is for AbacusSummit_base_c000_ph008, which is typical, supplemented
by AbacusSummit_base_c000_ph001 for some information about the full time slices.
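Each data line of the listing below is a size in units of 1,000,000 bytes, followed by a path and, in the per-file-type tables, a file type. As a minimal sketch of how such lines could be read programmatically, the Python below parses one line and converts the size to decimal terabytes. The function names `parse_du_line` and `to_terabytes` are hypothetical helpers introduced here for illustration; they are not part of any AbacusSummit tooling.

```python
def parse_du_line(line):
    """Split one listing line into (size_mb, path, file_type_or_None).

    Assumes the line is a data line, not a '#' comment.
    """
    size_text, rest = line.split(maxsplit=1)
    rest = rest.strip()
    if rest.startswith("["):
        # Placeholder entries such as "[top-level directory]" contain spaces.
        return int(size_text), rest, None
    parts = rest.split()
    file_type = parts[1] if len(parts) > 1 else None
    return int(size_text), parts[0], file_type


def to_terabytes(size_mb):
    """Convert du -BMB units (10**6 bytes) to decimal terabytes (10**12 bytes)."""
    return size_mb * 1e6 / 1e12


sample = [
    "7851475 AbacusSummit_base_c000_ph008",
    "73188 halos/z0.100 halo_info",
]
for line in sample:
    size_mb, path, file_type = parse_du_line(line)
    print(f"{path} {file_type or ''}: {to_terabytes(size_mb):.3f} TB")
```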
# The total outputs are 7.85 TB per simulation, if there are no full time slices.
7851475 AbacusSummit_base_c000_ph008
# When there are full time slices, they are about 3.5 TB *each*,
# with a mild trend in redshift, due to slight changes in compression efficiency.
# Sims have between 0 and 12 full time-slice epochs.
3578826 slices/z0.200
3534315 slices/z0.800
3498498 slices/z1.400
3472193 slices/z2.000
3451782 slices/z2.500
3431805 slices/z3.000
# Here's the breakdown; most of the data is in the halos.
# Most users won't need to look at the logs.
1 [top-level directory]
1 info
5649 log
1274121 lightcones
6571688 halos
# The lightcones contain 10% of the particles in rv+pid, and 100% in heal.
# rv: particle pos/vel
# pid: can be used to connect particles in the lightcone to halos in the timeslices;
# also contains the kernel density estimate from the fully sampled density field.
# heal: The Nside=16384, ring=True healpix projection
# These are broken into files for each time step, typically d(ln(1+z)) ~ 0.002.
703830 lightcones/rv
329456 lightcones/heal
240836 lightcones/pid
# The halos come in two flavors: 12 primary time slices with more outputs,
# and then 21 secondary time slices with only limited outputs.
# Most users will only want the primary slices; the secondary ones
# are intended for merger trees and for matching halos onto the light cones.
#
# The primary slices are bigger, typically 400 GB each:
412661 halos/z0.100
412863 halos/z0.200
412748 halos/z0.300
412340 halos/z0.400
411654 halos/z0.500
408158 halos/z0.800
402861 halos/z1.100
396109 halos/z1.400
388176 halos/z1.700
379538 halos/z2.000
364212 halos/z2.500
348863 halos/z3.000
# Total of primary slices: 4750 GB
# The secondary slices are smaller, typically 100 GB and decreasing at high z:
111493 halos/z0.150
110879 halos/z0.250
110041 halos/z0.350
109006 halos/z0.450
107437 halos/z0.575
106324 halos/z0.650
105147 halos/z0.725
102498 halos/z0.875
101105 halos/z0.950
99575 halos/z1.025
96330 halos/z1.175
94619 halos/z1.250
92831 halos/z1.325
89070 halos/z1.475
87182 halos/z1.550
85311 halos/z1.625
79098 halos/z1.850
67995 halos/z2.250
53962 halos/z2.750
11267 halos/z5.000
351 halos/z8.000
# Total of secondary slices: 1822 GB
# One might opt to keep only certain types of files on disk, so here is a summary
# by file type.
# For example, a minimal installation might be only the halo_info and halo_rv_A files
# of the primary epochs, which are 1.1 TB, and perhaps only for some of those epochs.
# E.g., five epochs might knock the storage down to 0.5 TB per simulation.
# Another example would be to install everything except the field B samples of the
# primary epochs. This saves 1.83 TB per sim, so one is at 6 TB/sim.
# The halo info files contain the stats about all halos, typically 70-75 GB/epoch.
# Note that the file format supports reading only subsets of columns; many users
# will need to load only a small fraction of these.
73188 halos/z0.100 halo_info
74301 halos/z0.200 halo_info
75190 halos/z0.300 halo_info
75864 halos/z0.400 halo_info
76323 halos/z0.500 halo_info
76416 halos/z0.800 halo_info
74722 halos/z1.100 halo_info
71488 halos/z1.400 halo_info
67014 halos/z1.700 halo_info
61615 halos/z2.000 halo_info
51419 halos/z2.500 halo_info
40797 halos/z3.000 halo_info
# Primary halo_info: 818 GB
73764 halos/z0.150 halo_info
74763 halos/z0.250 halo_info
75550 halos/z0.350 halo_info
76107 halos/z0.450 halo_info
76508 halos/z0.575 halo_info
76591 halos/z0.650 halo_info
76548 halos/z0.725 halo_info
76120 halos/z0.875 halo_info
75755 halos/z0.950 halo_info
75270 halos/z1.025 halo_info
74008 halos/z1.175 halo_info
73240 halos/z1.250 halo_info
72376 halos/z1.325 halo_info
70387 halos/z1.475 halo_info
69312 halos/z1.550 halo_info
68204 halos/z1.625 halo_info
64279 halos/z1.850 halo_info
56553 halos/z2.250 halo_info
45891 halos/z2.750 halo_info
10066 halos/z5.000 halo_info
303 halos/z8.000 halo_info
# Secondary halo_info: 1362 GB
# The particles associated with these halos are stored as consistent subsamples,
# 3% in A and 7% in B, all indexed from the halo_info files.
# First, we have the positions and velocities.
# Users painting HOD satellite galaxies into the halos probably could just use the A set.
31415 halos/z0.100 halo_rv_A
30859 halos/z0.200 halo_rv_A
30213 halos/z0.300 halo_rv_A
29498 halos/z0.400 halo_rv_A
28733 halos/z0.500 halo_rv_A
26264 halos/z0.800 halo_rv_A
23699 halos/z1.100 halo_rv_A
21216 halos/z1.400 halo_rv_A
18767 halos/z1.700 halo_rv_A
16457 halos/z2.000 halo_rv_A
12965 halos/z2.500 halo_rv_A
9965 halos/z3.000 halo_rv_A
# Primary 3% halo rv: 280 GB
70448 halos/z0.100 halo_rv_B
69199 halos/z0.200 halo_rv_B
67735 halos/z0.300 halo_rv_B
66123 halos/z0.400 halo_rv_B
64399 halos/z0.500 halo_rv_B
58825 halos/z0.800 halo_rv_B
53036 halos/z1.100 halo_rv_B
47321 halos/z1.400 halo_rv_B
41824 halos/z1.700 halo_rv_B
36626 halos/z2.000 halo_rv_B
28821 halos/z2.500 halo_rv_B
22132 halos/z3.000 halo_rv_B
# Primary 7% halo rv: 626 GB
# And then we have the PIDs, with kernel densities embedded.
# These are used to build merger trees, but could also be used
# to track particular particles as part of galaxy assignment,
# e.g., to find the densest particle in a progenitor halo and
# use its late-time position.
13680 halos/z0.100 halo_pid_A
13325 halos/z0.200 halo_pid_A
12947 halos/z0.300 halo_pid_A
12548 halos/z0.400 halo_pid_A
12140 halos/z0.500 halo_pid_A
10870 halos/z0.800 halo_pid_A
9652 halos/z1.100 halo_pid_A
8493 halos/z1.400 halo_pid_A
7421 halos/z1.700 halo_pid_A
6442 halos/z2.000 halo_pid_A
4989 halos/z2.500 halo_pid_A
3777 halos/z3.000 halo_pid_A
# Primary 3% halo pid: 116 GB
30217 halos/z0.100 halo_pid_B
29390 halos/z0.200 halo_pid_B
28501 halos/z0.300 halo_pid_B
27584 halos/z0.400 halo_pid_B
26637 halos/z0.500 halo_pid_B
23780 halos/z0.800 halo_pid_B
21020 halos/z1.100 halo_pid_B
18439 halos/z1.400 halo_pid_B
16061 halos/z1.700 halo_pid_B
13899 halos/z2.000 halo_pid_B
10731 halos/z2.500 halo_pid_B
8097 halos/z3.000 halo_pid_B
# Primary 7% halo pid: 254 GB
# For the secondary epochs, we provide only the PID+density file.
# These are slightly smaller because they include only L1 particles, not L0 particles.
11726 halos/z0.150 halo_pid_A
11238 halos/z0.250 halo_pid_A
10743 halos/z0.350 halo_pid_A
10256 halos/z0.450 halo_pid_A
9655 halos/z0.575 halo_pid_A
9292 halos/z0.650 halo_pid_A
8940 halos/z0.725 halo_pid_A
8258 halos/z0.875 halo_pid_A
7941 halos/z0.950 halo_pid_A
7620 halos/z1.025 halo_pid_A
7007 halos/z1.175 halo_pid_A
6716 halos/z1.250 halo_pid_A
6429 halos/z1.325 halo_pid_A
5883 halos/z1.475 halo_pid_A
5628 halos/z1.550 halo_pid_A
5389 halos/z1.625 halo_pid_A
4680 halos/z1.850 halo_pid_A
3621 halos/z2.250 halo_pid_A
2563 halos/z2.750 halo_pid_A
395 halos/z5.000 halo_pid_A
16 halos/z8.000 halo_pid_A
# Secondary 3% halo pid: 144 GB
26003 halos/z0.150 halo_pid_B
24879 halos/z0.250 halo_pid_B
23749 halos/z0.350 halo_pid_B
22643 halos/z0.450 halo_pid_B
21274 halos/z0.575 halo_pid_B
20442 halos/z0.650 halo_pid_B
19659 halos/z0.725 halo_pid_B
18120 halos/z0.875 halo_pid_B
17410 halos/z0.950 halo_pid_B
16685 halos/z1.025 halo_pid_B
15316 halos/z1.175 halo_pid_B
14664 halos/z1.250 halo_pid_B
14026 halos/z1.325 halo_pid_B
12802 halos/z1.475 halo_pid_B
12243 halos/z1.550 halo_pid_B
11719 halos/z1.625 halo_pid_B
10139 halos/z1.850 halo_pid_B
7822 halos/z2.250 halo_pid_B
5509 halos/z2.750 halo_pid_B
806 halos/z5.000 halo_pid_B
34 halos/z8.000 halo_pid_B
# Secondary 7% halo pid: 316 GB
# We provide the rest of the density field, i.e., the complement of the halo set,
# in the subsamples. These would be used in matter-field statistics, to associate
# particles in the periphery of halos, or to run a different group finder
# (admittedly on only 10% of the dynamical particles).
43432 halos/z0.100 field_rv_A
44044 halos/z0.200 field_rv_A
44724 halos/z0.300 field_rv_A
45445 halos/z0.400 field_rv_A
46195 halos/z0.500 field_rv_A
48542 halos/z0.800 field_rv_A
50898 halos/z1.100 field_rv_A
53164 halos/z1.400 field_rv_A
55295 halos/z1.700 field_rv_A
57277 halos/z2.000 field_rv_A
60189 halos/z2.500 field_rv_A
62610 halos/z3.000 field_rv_A
# Primary 3% field rv: 612 GB
97497 halos/z0.100 field_rv_B
98866 halos/z0.200 field_rv_B
100381 halos/z0.300 field_rv_B
101991 halos/z0.400 field_rv_B
103666 halos/z0.500 field_rv_B
108916 halos/z0.800 field_rv_B
114209 halos/z1.100 field_rv_B
119330 halos/z1.400 field_rv_B
124160 halos/z1.700 field_rv_B
128634 halos/z2.000 field_rv_B
135233 halos/z2.500 field_rv_B
140699 halos/z3.000 field_rv_B
# Primary 7% field rv: 1374 GB
# Field PIDs are probably not used much, but these do relate particles across epochs
# and the PID encodes the initial grid location for Lagrangian displacements.
16508 halos/z0.100 field_pid_A
16556 halos/z0.200 field_pid_A
16627 halos/z0.300 field_pid_A
16709 halos/z0.400 field_pid_A
16800 halos/z0.500 field_pid_A
17124 halos/z0.800 field_pid_A
17468 halos/z1.100 field_pid_A
17809 halos/z1.400 field_pid_A
18128 halos/z1.700 field_pid_A
18443 halos/z2.000 field_pid_A
18833 halos/z2.500 field_pid_A
19093 halos/z3.000 field_pid_A
# Primary 3% field pid: 210 GB
36280 halos/z0.100 field_pid_B
36327 halos/z0.200 field_pid_B
36434 halos/z0.300 field_pid_B
36582 halos/z0.400 field_pid_B
36765 halos/z0.500 field_pid_B
37426 halos/z0.800 field_pid_B
38160 halos/z1.100 field_pid_B
38855 halos/z1.400 field_pid_B
39510 halos/z1.700 field_pid_B
40150 halos/z2.000 field_pid_B
41037 halos/z2.500 field_pid_B
41698 halos/z3.000 field_pid_B
# Primary 7% field pid: 459 GB
# The full time slices are split into L0 and field (non-L0) sets.
# However, this was just due to convenience in the code; the L0 particles
# are not indexed in halo_info. Only concatenations will be useful.
# The ratio of L0 to field particles increases toward low redshift.
#
# The position+velocity data is in the pack9 format, which gives
# somewhat higher precision than RVint. These files average about
# 2.8 TB per epoch, which is 8.5 bytes per particle.
1682068 slices/z0.200/field_pack9
1083440 slices/z0.200/L0_pack9
1864771 slices/z0.800/field_pack9
925909 slices/z0.800/L0_pack9
2061023 slices/z1.400/field_pack9
748046 slices/z1.400/L0_pack9
2242314 slices/z2.000/field_pack9
581440 slices/z2.000/L0_pack9
2371831 slices/z2.500/field_pack9
458272 slices/z2.500/L0_pack9
2480799 slices/z3.000/field_pack9
352132 slices/z3.000/L0_pack9
# The PID and kernel density estimate are in the pid files.
# These average about 0.7 TB per epoch, increasing toward low redshift.
510862 slices/z3.000/field_pack9_pid
88014 slices/z3.000/L0_pack9_pid
501868 slices/z2.500/field_pack9_pid
119812 slices/z2.500/L0_pack9_pid
489859 slices/z2.000/field_pack9_pid
158581 slices/z2.000/L0_pack9_pid
472529 slices/z1.400/field_pack9_pid
216902 slices/z1.400/L0_pack9_pid
454954 slices/z0.800/field_pack9_pid
288682 slices/z0.800/L0_pack9_pid
444629 slices/z0.200/field_pack9_pid
368691 slices/z0.200/L0_pack9_pid
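
The arithmetic behind the minimal installation described above (halo_info plus halo_rv_A for the primary epochs) can be checked with a short sketch. The sizes are copied from the per-file-type tables in this listing. The helper name `total_tb` is hypothetical, and the choice of the five lowest-redshift primary epochs is an assumption made for illustration, since the text does not say which five epochs are meant.

```python
# Sizes in units of 10**6 bytes, copied from the primary-epoch tables above.
HALO_INFO = {
    "z0.100": 73188, "z0.200": 74301, "z0.300": 75190, "z0.400": 75864,
    "z0.500": 76323, "z0.800": 76416, "z1.100": 74722, "z1.400": 71488,
    "z1.700": 67014, "z2.000": 61615, "z2.500": 51419, "z3.000": 40797,
}
HALO_RV_A = {
    "z0.100": 31415, "z0.200": 30859, "z0.300": 30213, "z0.400": 29498,
    "z0.500": 28733, "z0.800": 26264, "z1.100": 23699, "z1.400": 21216,
    "z1.700": 18767, "z2.000": 16457, "z2.500": 12965, "z3.000": 9965,
}


def total_tb(tables, epochs=None):
    """Sum the selected epochs over several size tables; return decimal TB.

    epochs=None selects every epoch present in each table.
    """
    total_mb = 0
    for table in tables:
        for epoch, size_mb in table.items():
            if epochs is None or epoch in epochs:
                total_mb += size_mb
    return total_mb / 1e6  # 10**6 bytes -> 10**12 bytes


# All twelve primary epochs.
all_primary = total_tb([HALO_INFO, HALO_RV_A])

# Assumed subset: the five lowest-redshift primary epochs.
five_epochs = {"z0.100", "z0.200", "z0.300", "z0.400", "z0.500"}
five = total_tb([HALO_INFO, HALO_RV_A], epochs=five_epochs)

print(f"{all_primary:.2f} TB")  # prints 1.10 TB
print(f"{five:.2f} TB")         # prints 0.53 TB
```

Both results are consistent with the 1.1 TB and 0.5 TB figures quoted in the listing. A different choice of five epochs would give a somewhat smaller total, because the halo_rv_A files shrink toward high redshift.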