Disk Space
For users interested in downloading AbacusSummit data, this breakdown of the file sizes of the various data products may be helpful.
The following is an annotated summary of du -BMB output,
so sizes are in units of 1,000,000 bytes (the convention disk drives use).
This is for AbacusSummit_base_c000_ph008, which is typical, supplemented
by AbacusSummit_base_c000_ph001 for some information about the full time slices.
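For readers who want to produce comparable numbers for their own local copy, the following is a minimal Python sketch of a du -BMB style total. It is an approximation: it sums apparent file sizes, whereas du reports allocated disk blocks and also counts directory entries, so small differences are expected. The function name dir_size_mb and the example path are illustrative, not part of any AbacusSummit tooling.

```python
import math
import os


def dir_size_mb(path):
    """Total size of regular files under `path`, in units of 1,000,000 bytes.

    The result is rounded up to a whole number, as du -BMB does.
    Symbolic links are skipped so that linked data is not double counted.
    """
    total_bytes = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if not os.path.islink(full):
                total_bytes += os.path.getsize(full)
    return math.ceil(total_bytes / 1_000_000)


# Hypothetical usage on a local download:
# print(dir_size_mb("AbacusSummit_base_c000_ph008/halos/z0.100"))
```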
# The total outputs are 7.85 TB per simulation, if there are no full time slices.
7851475 AbacusSummit_base_c000_ph008
# When there are full time slices, they are about 3.5 TB *each*,
# with a mild trend in redshift, due to slight changes in compression efficiency.
# Sims have between 0 and 12 full time-slice epochs.
3578826 slices/z0.200
3534315 slices/z0.800
3498498 slices/z1.400
3472193 slices/z2.000
3451782 slices/z2.500
3431805 slices/z3.000
# Here's the breakdown; most of the data is in the halos.
# Most users won't need to look at the logs.
1 [top-level directory]
1 info
5649 log
1274121 lightcones
6571688 halos
# The lightcones contain 10% of the particles in rv+pid, and 100% in heal.
# rv: particle pos/vel
# pid: can be used to connect particles in the lightcone to halos in the timeslices;
# also contains the kernel density estimate from the fully sampled density field.
# heal: The Nside=16384, ring=True healpix projection
# These are broken into files for each time step, typically d(ln(1+z)) ~ 0.002.
703830 lightcones/rv
329456 lightcones/heal
240836 lightcones/pid
# The halos come in two flavors: 12 primary time slices with more outputs,
# and then 21 secondary time slices with only limited outputs.
# Most users will only want the primary slices; the secondary ones
# are intended for merger trees and for matching halos onto the light cones.
#
# The primary slices are bigger, typically 400 GB each:
412661 halos/z0.100
412863 halos/z0.200
412748 halos/z0.300
412340 halos/z0.400
411654 halos/z0.500
408158 halos/z0.800
402861 halos/z1.100
396109 halos/z1.400
388176 halos/z1.700
379538 halos/z2.000
364212 halos/z2.500
348863 halos/z3.000
# Total of primary slices: 4750 GB
# The secondary slices are smaller, typically 100 GB and decreasing at high z:
111493 halos/z0.150
110879 halos/z0.250
110041 halos/z0.350
109006 halos/z0.450
107437 halos/z0.575
106324 halos/z0.650
105147 halos/z0.725
102498 halos/z0.875
101105 halos/z0.950
99575 halos/z1.025
96330 halos/z1.175
94619 halos/z1.250
92831 halos/z1.325
89070 halos/z1.475
87182 halos/z1.550
85311 halos/z1.625
79098 halos/z1.850
67995 halos/z2.250
53962 halos/z2.750
11267 halos/z5.000
351 halos/z8.000
# Total of secondary slices: 1822 GB
# One might opt to keep only certain types of files on disk, so here is a summary
# by file type.
# For example, a minimal installation might be only the halo_info and halo_rv_A files,
# which are 1.1 TB, and perhaps only for some of the primary epochs. E.g., five
# epochs might knock the storage down to 0.5 TB per simulation.
# Another example would be to install everything except the field B samples,
# which exist only for the primary epochs. This saves 1.83 TB per sim, leaving about 6 TB/sim.
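The arithmetic behind these examples can be reproduced with a short Python sketch. The per-type figures are the "Primary ..." totals quoted in the comments of this listing (in GB); the helper primary_budget_gb is hypothetical, and it assumes all twelve primary epochs are the same size, which is only roughly true since the halo files shrink toward high redshift.

```python
# Per-type totals in GB (10^9 bytes) for the 12 primary epochs,
# copied from the "Primary ..." summary comments in this listing.
PRIMARY_GB = {
    "halo_info": 818,
    "halo_rv_A": 280,
    "halo_rv_B": 626,
    "halo_pid_A": 116,
    "halo_pid_B": 254,
    "field_rv_A": 612,
    "field_rv_B": 1374,
    "field_pid_A": 210,
    "field_pid_B": 459,
}
N_PRIMARY_EPOCHS = 12
TOTAL_GB = 7851  # whole simulation, no full time slices


def primary_budget_gb(kinds, n_epochs=N_PRIMARY_EPOCHS):
    """Rough size in GB of the chosen file types for `n_epochs` primary epochs.

    Assumes every epoch has the same size (an approximation).
    """
    per_type = sum(PRIMARY_GB[k] for k in kinds)
    return per_type * n_epochs / N_PRIMARY_EPOCHS


minimal_all = primary_budget_gb(["halo_info", "halo_rv_A"])
minimal_five = primary_budget_gb(["halo_info", "halo_rv_A"], n_epochs=5)
field_b = primary_budget_gb(["field_rv_B", "field_pid_B"])

print(f"halo_info + halo_rv_A, 12 epochs: {minimal_all:.0f} GB")
print(f"halo_info + halo_rv_A, 5 epochs:  {minimal_five:.0f} GB")
print(f"everything except field B:        {TOTAL_GB - field_b:.0f} GB")
```

Which five epochs are kept matters in practice: low-redshift epochs are larger than high-redshift ones, so the equal-size estimate is only a starting point.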
# The halo info files contain the stats about all halos, typically 70-75 GB/epoch.
# Note that the file format supports reading only subsets of columns; many users
# will need to load only a small fraction of these.
73188 halos/z0.100 halo_info
74301 halos/z0.200 halo_info
75190 halos/z0.300 halo_info
75864 halos/z0.400 halo_info
76323 halos/z0.500 halo_info
76416 halos/z0.800 halo_info
74722 halos/z1.100 halo_info
71488 halos/z1.400 halo_info
67014 halos/z1.700 halo_info
61615 halos/z2.000 halo_info
51419 halos/z2.500 halo_info
40797 halos/z3.000 halo_info
# Primary halo_info: 818 GB
73764 halos/z0.150 halo_info
74763 halos/z0.250 halo_info
75550 halos/z0.350 halo_info
76107 halos/z0.450 halo_info
76508 halos/z0.575 halo_info
76591 halos/z0.650 halo_info
76548 halos/z0.725 halo_info
76120 halos/z0.875 halo_info
75755 halos/z0.950 halo_info
75270 halos/z1.025 halo_info
74008 halos/z1.175 halo_info
73240 halos/z1.250 halo_info
72376 halos/z1.325 halo_info
70387 halos/z1.475 halo_info
69312 halos/z1.550 halo_info
68204 halos/z1.625 halo_info
64279 halos/z1.850 halo_info
56553 halos/z2.250 halo_info
45891 halos/z2.750 halo_info
10066 halos/z5.000 halo_info
303 halos/z8.000 halo_info
# Secondary halo_info: 1362 GB
# The particles associated with these halos are stored as consistent subsamples,
# 3% in A and 7% in B, all indexed from the halo_info files.
# First, we have the positions and velocities.
# Users painting HOD satellite galaxies into the halos probably could just use the A set.
31415 halos/z0.100 halo_rv_A
30859 halos/z0.200 halo_rv_A
30213 halos/z0.300 halo_rv_A
29498 halos/z0.400 halo_rv_A
28733 halos/z0.500 halo_rv_A
26264 halos/z0.800 halo_rv_A
23699 halos/z1.100 halo_rv_A
21216 halos/z1.400 halo_rv_A
18767 halos/z1.700 halo_rv_A
16457 halos/z2.000 halo_rv_A
12965 halos/z2.500 halo_rv_A
9965 halos/z3.000 halo_rv_A
# Primary 3% halo rv: 280 GB
70448 halos/z0.100 halo_rv_B
69199 halos/z0.200 halo_rv_B
67735 halos/z0.300 halo_rv_B
66123 halos/z0.400 halo_rv_B
64399 halos/z0.500 halo_rv_B
58825 halos/z0.800 halo_rv_B
53036 halos/z1.100 halo_rv_B
47321 halos/z1.400 halo_rv_B
41824 halos/z1.700 halo_rv_B
36626 halos/z2.000 halo_rv_B
28821 halos/z2.500 halo_rv_B
22132 halos/z3.000 halo_rv_B
# Primary 7% halo rv: 626 GB
# And then we have the PIDs, with kernel densities embedded.
# These are used to build merger trees, but could also be used
# to track particular particles as part of galaxy assignment,
# e.g., to find the densest particle in a progenitor halo and
# use its late-time position.
13680 halos/z0.100 halo_pid_A
13325 halos/z0.200 halo_pid_A
12947 halos/z0.300 halo_pid_A
12548 halos/z0.400 halo_pid_A
12140 halos/z0.500 halo_pid_A
10870 halos/z0.800 halo_pid_A
9652 halos/z1.100 halo_pid_A
8493 halos/z1.400 halo_pid_A
7421 halos/z1.700 halo_pid_A
6442 halos/z2.000 halo_pid_A
4989 halos/z2.500 halo_pid_A
3777 halos/z3.000 halo_pid_A
# Primary 3% halo pid: 116 GB
30217 halos/z0.100 halo_pid_B
29390 halos/z0.200 halo_pid_B
28501 halos/z0.300 halo_pid_B
27584 halos/z0.400 halo_pid_B
26637 halos/z0.500 halo_pid_B
23780 halos/z0.800 halo_pid_B
21020 halos/z1.100 halo_pid_B
18439 halos/z1.400 halo_pid_B
16061 halos/z1.700 halo_pid_B
13899 halos/z2.000 halo_pid_B
10731 halos/z2.500 halo_pid_B
8097 halos/z3.000 halo_pid_B
# Primary 7% halo pid: 254 GB
# For the secondary epochs, we provide only the PID+density file.
# These are slightly smaller because they include only L1 particles, not L0 particles.
11726 halos/z0.150 halo_pid_A
11238 halos/z0.250 halo_pid_A
10743 halos/z0.350 halo_pid_A
10256 halos/z0.450 halo_pid_A
9655 halos/z0.575 halo_pid_A
9292 halos/z0.650 halo_pid_A
8940 halos/z0.725 halo_pid_A
8258 halos/z0.875 halo_pid_A
7941 halos/z0.950 halo_pid_A
7620 halos/z1.025 halo_pid_A
7007 halos/z1.175 halo_pid_A
6716 halos/z1.250 halo_pid_A
6429 halos/z1.325 halo_pid_A
5883 halos/z1.475 halo_pid_A
5628 halos/z1.550 halo_pid_A
5389 halos/z1.625 halo_pid_A
4680 halos/z1.850 halo_pid_A
3621 halos/z2.250 halo_pid_A
2563 halos/z2.750 halo_pid_A
395 halos/z5.000 halo_pid_A
16 halos/z8.000 halo_pid_A
# Secondary 3% halo pid: 144 GB
26003 halos/z0.150 halo_pid_B
24879 halos/z0.250 halo_pid_B
23749 halos/z0.350 halo_pid_B
22643 halos/z0.450 halo_pid_B
21274 halos/z0.575 halo_pid_B
20442 halos/z0.650 halo_pid_B
19659 halos/z0.725 halo_pid_B
18120 halos/z0.875 halo_pid_B
17410 halos/z0.950 halo_pid_B
16685 halos/z1.025 halo_pid_B
15316 halos/z1.175 halo_pid_B
14664 halos/z1.250 halo_pid_B
14026 halos/z1.325 halo_pid_B
12802 halos/z1.475 halo_pid_B
12243 halos/z1.550 halo_pid_B
11719 halos/z1.625 halo_pid_B
10139 halos/z1.850 halo_pid_B
7822 halos/z2.250 halo_pid_B
5509 halos/z2.750 halo_pid_B
806 halos/z5.000 halo_pid_B
34 halos/z8.000 halo_pid_B
# Secondary 7% halo pid: 316 GB
# We provide the rest of the density field, i.e., the complement of the halo set,
# in the subsamples. These would be used for matter-field statistics, to associate
# particles in the periphery of halos, or to run a different group finder
# (admittedly on only 10% of the dynamical particles).
43432 halos/z0.100 field_rv_A
44044 halos/z0.200 field_rv_A
44724 halos/z0.300 field_rv_A
45445 halos/z0.400 field_rv_A
46195 halos/z0.500 field_rv_A
48542 halos/z0.800 field_rv_A
50898 halos/z1.100 field_rv_A
53164 halos/z1.400 field_rv_A
55295 halos/z1.700 field_rv_A
57277 halos/z2.000 field_rv_A
60189 halos/z2.500 field_rv_A
62610 halos/z3.000 field_rv_A
# Primary 3% field rv: 612 GB
97497 halos/z0.100 field_rv_B
98866 halos/z0.200 field_rv_B
100381 halos/z0.300 field_rv_B
101991 halos/z0.400 field_rv_B
103666 halos/z0.500 field_rv_B
108916 halos/z0.800 field_rv_B
114209 halos/z1.100 field_rv_B
119330 halos/z1.400 field_rv_B
124160 halos/z1.700 field_rv_B
128634 halos/z2.000 field_rv_B
135233 halos/z2.500 field_rv_B
140699 halos/z3.000 field_rv_B
# Primary 7% field rv: 1374 GB
# Field PIDs are probably not used much, but these do relate particles across epochs
# and the PID encodes the initial grid location for Lagrangian displacements.
16508 halos/z0.100 field_pid_A
16556 halos/z0.200 field_pid_A
16627 halos/z0.300 field_pid_A
16709 halos/z0.400 field_pid_A
16800 halos/z0.500 field_pid_A
17124 halos/z0.800 field_pid_A
17468 halos/z1.100 field_pid_A
17809 halos/z1.400 field_pid_A
18128 halos/z1.700 field_pid_A
18443 halos/z2.000 field_pid_A
18833 halos/z2.500 field_pid_A
19093 halos/z3.000 field_pid_A
# Primary 3% field pid: 210 GB
36280 halos/z0.100 field_pid_B
36327 halos/z0.200 field_pid_B
36434 halos/z0.300 field_pid_B
36582 halos/z0.400 field_pid_B
36765 halos/z0.500 field_pid_B
37426 halos/z0.800 field_pid_B
38160 halos/z1.100 field_pid_B
38855 halos/z1.400 field_pid_B
39510 halos/z1.700 field_pid_B
40150 halos/z2.000 field_pid_B
41037 halos/z2.500 field_pid_B
41698 halos/z3.000 field_pid_B
# Primary 7% field pid: 459 GB
# The full time slices are split into L0 and field (non-L0) sets.
# This split was only a convenience in the code; the L0 particles
# are not indexed in halo_info, so only the concatenation of the two sets is useful.
# The ratio of L0 to field particles increases toward low redshift.
#
# The position+velocity data is in the pack9 format, which gives
# somewhat higher precision than RVint. These files average about
# 2.8 TB per epoch, which is 8.5 bytes per particle.
1682068 slices/z0.200/field_pack9
1083440 slices/z0.200/L0_pack9
1864771 slices/z0.800/field_pack9
925909 slices/z0.800/L0_pack9
2061023 slices/z1.400/field_pack9
748046 slices/z1.400/L0_pack9
2242314 slices/z2.000/field_pack9
581440 slices/z2.000/L0_pack9
2371831 slices/z2.500/field_pack9
458272 slices/z2.500/L0_pack9
2480799 slices/z3.000/field_pack9
352132 slices/z3.000/L0_pack9
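The quoted average and bytes-per-particle figure can be checked from the sizes listed above with a short Python sketch. Note the assumption: the particle count of 6912^3 per base simulation is not stated in this listing and is supplied here for illustration.

```python
# Combined field + L0 pack9 size per epoch, in units of 10^6 bytes,
# taken from the listing above.
PACK9_MB = {
    "z0.200": 1682068 + 1083440,
    "z0.800": 1864771 + 925909,
    "z1.400": 2061023 + 748046,
    "z2.000": 2242314 + 581440,
    "z2.500": 2371831 + 458272,
    "z3.000": 2480799 + 352132,
}

# ASSUMPTION (not stated in this listing): 6912^3 particles per base box.
N_PARTICLES = 6912 ** 3

mean_mb = sum(PACK9_MB.values()) / len(PACK9_MB)
bytes_per_particle = mean_mb * 1_000_000 / N_PARTICLES

print(f"mean pack9 size per epoch: {mean_mb / 1e6:.2f} TB")
print(f"bytes per particle:        {bytes_per_particle:.2f}")
```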
# The PID and kernel density estimate are in the pid files.
# These average about 0.7 TB per epoch, increasing toward low redshift.
510862 slices/z3.000/field_pack9_pid
88014 slices/z3.000/L0_pack9_pid
501868 slices/z2.500/field_pack9_pid
119812 slices/z2.500/L0_pack9_pid
489859 slices/z2.000/field_pack9_pid
158581 slices/z2.000/L0_pack9_pid
472529 slices/z1.400/field_pack9_pid
216902 slices/z1.400/L0_pack9_pid
454954 slices/z0.800/field_pack9_pid
288682 slices/z0.800/L0_pack9_pid
444629 slices/z0.200/field_pack9_pid
368691 slices/z0.200/L0_pack9_pid
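To re-derive the per-type totals quoted in the comments, the listing itself can be parsed. The following Python sketch assumes the line layout used on this page (size, path, and for halo entries a file type in a third column); the function name sum_by_type is illustrative.

```python
from collections import defaultdict


def sum_by_type(listing):
    """Sum sizes (units of 10^6 bytes) per file type from an annotated listing.

    Lines starting with '#' and blank lines are skipped. For entries such as
    '31415 halos/z0.100 halo_rv_A' the third column is used as the key;
    all other entries are keyed by the text after the size.
    """
    totals = defaultdict(int)
    for line in listing.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split()
        if not parts[0].isdigit() or len(parts) < 2:
            continue
        size = int(parts[0])
        if len(parts) == 3 and parts[1].startswith("halos/"):
            key = parts[2]
        else:
            key = " ".join(parts[1:])
        totals[key] += size
    return dict(totals)


# A short excerpt of this listing, used as sample input.
excerpt = """\
# Primary 3% halo rv
12965 halos/z2.500 halo_rv_A
9965 halos/z3.000 halo_rv_A
22132 halos/z3.000 halo_rv_B
1274121 lightcones
"""
print(sum_by_type(excerpt))
```

Because the key is the file type alone, types that appear at both primary and secondary epochs (halo_info, halo_pid_A, halo_pid_B) are summed together; to separate them, parse the two groups of lines independently.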