Introduction
This data release is a compilation of known landslides, debris flows, lahars, and outburst floods that generated seismic signals observable on existing seismic networks. The data release includes basic information about each event such as location, volume, area, and runout distances as well as information about seismic detections and the location of seismic data, photos, maps, GIS files, and links to papers, websites, and media reports about the event. Not all record types exist for each event, and the quality of the information varies from event to event.
The SQLite3 database (lsseis.db) is the native format of this data release and preserves its relational structure. For the convenience of users, we also extracted two summary csv files from the database: Events.csv, which summarizes the location and basic information for each event in the collection, and references.csv, which lists all of the references used. We also extracted csv files specific to each event; these are contained in event_data.zip. When unzipped, it contains a folder for each event holding .csv files that summarize the sources of information we used for the event (*_information.csv), the references used (*_references.csv), the seismic detections (*_seismic_detections.csv), the photos and figures (*_photos_figures.csv), and the maps and GIS files (*_maps_gis.csv), where * stands for the unique event id (Eid), name, and date of the event. Each folder also contains any accompanying files such as photos, maps, and seismic data (if not already archived). Not all files exist for all events. These csv files are described in the metadata. The methods used to compile the database are described in the following sections.
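As a minimal sketch of how the summary file might be used, the following Python snippet filters the rows of Events.csv by the OtherDataQuality field. It assumes that the csv header row uses the field names listed in Table 1 (Eid, Name, OtherDataQuality); the function name events_with_min_quality and the sample rows are hypothetical and are not part of the data release.

```python
import csv
import io


def events_with_min_quality(csv_file, min_quality):
    """Return (Eid, Name) pairs for events with OtherDataQuality >= min_quality.

    Assumes the header row uses the Table 1 field names. Rows with a blank
    quality value are skipped.
    """
    selected = []
    for row in csv.DictReader(csv_file):
        quality = (row.get("OtherDataQuality") or "").strip()
        if quality and int(quality) >= min_quality:
            selected.append((int(row["Eid"]), row["Name"]))
    return selected


# Usage with a small in-memory stand-in for Events.csv (made-up rows).
sample = io.StringIO(
    "Eid,Name,OtherDataQuality\n"
    "1,Example A,5\n"
    "2,Example B,2\n"
    "3,Example C,\n"
)
print(events_with_min_quality(sample, 4))
```

For the real file, the same function could be called on a file object opened with open("Events.csv", newline="").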
Methods
We identified candidate events for inclusion by searching scientific journals, agency reports, blogs, and websites. We obtained additional event notifications through direct communication with scientists, particularly at the volcano observatories and local seismic networks. We also searched through archives and activity logs from those observatories and networks. We primarily include events for which digital seismic data are available; one event has solely paper records. Our oldest event is from 1977, and the oldest for which there are digital data is from 1989, but most events are from 2004 onward because continuous archiving of waveform data began around that time, making it easier to obtain seismic data for past events that were not listed in earthquake catalogs.
Once identified, each event was entered into a relational SQLite3 database with a main event table (events) consisting of basic event information. Then as each event was processed, we populated additional tables with seismic detections, links to references, photos, imagery, geospatial files, and notes about where and how we obtained the values entered in the basic event information table for reproducibility (saved in the references, photos, gisfiles, and information tables). Photograph location coordinates were extracted from the photo metadata, if present. There is no guarantee of the accuracy of the photo locations as that is dependent on the equipment used by the photographer.
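Because not every table and column is spelled out in this description, a reasonable first step when opening the database is to list the tables it contains. The sketch below does this with Python's built-in sqlite3 module. The function name table_row_counts is hypothetical, and the demonstration uses an in-memory stand-in with made-up tables and columns rather than the actual lsseis.db schema.

```python
import sqlite3


def table_row_counts(conn):
    """Return {table_name: row_count} for every table in the database."""
    names = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]
    counts = {}
    for name in names:
        # Names come from sqlite_master itself, so quoting them is sufficient.
        counts[name] = conn.execute(f'SELECT COUNT(*) FROM "{name}"').fetchone()[0]
    return counts


# Demonstration on an in-memory stand-in (hypothetical columns and rows).
# For the real file, use sqlite3.connect("lsseis.db") instead.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (Eid INTEGER PRIMARY KEY, Name TEXT)")
conn.execute("CREATE TABLE photos (Eid INTEGER, filename TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)", [(1, "Example A"), (2, "Example B")]
)
conn.execute("INSERT INTO photos VALUES (1, 'example.jpg')")
print(table_row_counts(conn))
```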
The database was populated through two main analysis steps. First, we populated the basic event information fields, then we performed a detailed seismic analysis of all available seismic data out to detection limits.
Basic event information
Table 1 describes the basic information that is included in the catalogue. For some events, we did not have sufficient information (imagery, photos, published reports) to fill some of the fields, so in those cases the fields are left blank. The completeness and quality of these fields for a given event depend on the amount and quality of information available to us at the time of processing. Information availability varies widely from event to event: some events are detailed in dedicated peer-reviewed publications, while others simply have a seismic detection and an approximate location or eyewitness report. To reflect this variability, we classify each event with a quality factor from 1 to 5 based on the ancillary data available, using the scale described under OtherDataQuality in Table 1.
Some of the basic event data described in Table 1 were seismically derived (StartTime, EndTime, LPpotential, maxdistHF_km, maxdistLP_km, DatLocation); the methods for extracting these values are described in the Seismic data analysis section. The event type was either taken from a detailed peer-reviewed publication of the event or assigned based on the event characteristics, following the classification scheme proposed by Hungr and others (2014), which is an expansion of Varnes (1978), because the Hungr and others (2014) scheme is more suitable for describing the characteristics of this event set. We deviate from the Hungr and others (2014) naming scheme in two ways: (1) we use the term outburst flood instead of their proposed “debris flood” where appropriate to stress the sudden nature of some of the included events, and (2) we call all avalanching events on volcanoes that involve both ice and rock from the volcano “rock and ice avalanches”, even though the rock material could be considered debris in some cases, because the distinction is not always clear. The term debris avalanche is still used for some events if the peer-reviewed publications use that term or if the source material was clearly debris and not rock. Note that even though some of the rock avalanches start with primarily vertical falls, they are termed rock avalanches when the downslope movement continues after the fall in an avalanching style. Compound events where two types of movement are considered important enough to the signal character to note separately are given both names in order of occurrence. When the event type is unknown, such as when there was no visual confirmation or other information available, landslide is used as a catch-all term. Landslide terminology (crown, toe, etc.) is used as defined in Cruden and Varnes (1996). The location uncertainty (LocUncert_km) was set to zero when imagery existed or exact coordinates were reported. For events where only photos were available and thus landmarks were used to determine locations, the uncertainty of the location of the crown was used. If the location was based solely on a seismic location, then the reported seismic location uncertainty was used.
Table 1: Description of Basic Event Information (Events) Table Fields
Events Table Field | Units or Format | Description |
Eid | Integer | Unique event identification number for each entry. |
Name | N/A | Name given to each event entry based on location characteristics. Related events are further distinguished by order of occurrence, e.g. (Oso main, Oso secondary, West Salt Creek precursory, South Twin 1, South Twin 2, South Twin 3). |
StartTime | YYYY-MM-DD HH:MM:SS | Approximate time of first observable arrivals on closest seismic stations, in UTC. Actual start time of surface movement is earlier by an unknown amount in most cases due to emergent onsets and wave travel times. The start time is taken as the earliest point where the signal emerges from the background noise on the closest three stations that have duration picks in the high frequency band, rounded down to the nearest second, e.g. (2005-05-15 17:11:40). |
EndTime | YYYY-MM-DD HH:MM:SS | Approximate end time, in UTC, of seismic signal generated by the surface movement. Chosen as the latest point where the signal disappears back into the background noise on the closest three stations for which there are duration picks in the high frequency band, rounded up to the nearest second, e.g. (2005-05-15 17:16:56). |
Latitude | Decimal degrees | Latitude of the source area of the event, same as Crown_lat if it exists, otherwise equal to the best estimate with an associated uncertainty (LocUncert_km). |
Longitude | Decimal degrees | Longitude of the source area of the event, same as Crown_lon if it exists, otherwise equal to the best estimate with an associated uncertainty (LocUncert_km). |
LocUncert_km | km | Estimate of the uncertainty of the event location (for events without imagery). |
Crown_lat | Decimal degrees | Latitude of the most upslope point of each surface expression (crown). Only provided for events with published locations, imagery, or photos. |
Crown_lon | Decimal degrees | Longitude of the most upslope point of each surface expression (crown). Only provided for events with published locations, imagery, or photos. |
Tip_lat | Decimal degrees | Latitude of the most distal point of the toe (tip) of each surface expression. Only provided for events with published locations, imagery, or photos. |
Tip_lon | Decimal degrees | Longitude of the most distal point of the toe (tip) of each surface expression. Only provided for events with published locations, imagery, or photos. |
Type | N/A | Landslide type, using the name in a published peer-reviewed journal or, when no published type is available, assigned from available information about the event following the Hungr and others (2014) classification approach, which is a modification of Varnes (1978). The generic term “landslide” is used when insufficient information is available to classify. |
Area_total | m² | Best estimate of the total area of the surface expression (source and deposition). Values from published sources or estimated from satellite imagery. |
Area_source | m² | Best estimate of area of source region. Values from published sources or estimated from satellite imagery. |
Area_source_low | m² | Lower bound estimate for area of source region. Values from published sources or estimated from satellite imagery. |
Area_source_high | m² | Upper bound estimate for area of source region. Values from published sources or estimated from satellite imagery. |
Volume | m³ | Best estimate of volume of material moved during surface event. Values from published sources or estimated from Area_source using methodology described in the methods section. |
Volume_low | m³ | Lower bound estimate for volume of material moved during surface event. Values from published sources or estimated from Area_source, assuming Area_source_low and Area_source_high correspond to one standard deviation, using methodology described in the methods section. |
Volume_high | m³ | Upper bound estimate for volume of material moved during surface event. Values from published sources or estimated from Area_source, assuming Area_source_low and Area_source_high correspond to one standard deviation, using methodology described in the methods section. |
Mass | kg | Mass of material moved during surface failure. Only reported when there are published values, even when volumes are available, due to uncertainties in material density, especially for events involving significant amounts of ice and/or water. |
Mass_low | kg | Lower bound estimate of mass moved during surface failure. Only reported when there are published values. |
Mass_high | kg | Upper bound estimate of mass moved during surface failure. Only reported when there are published values. |
H | m | Height of each landslide measured from crown to tip of the toe. Values are from published sources or estimated by subtracting toe tip elevation from crown elevation. |
H_low | m | Lower bound estimate for height. Values from published sources or estimated based on uncertainty in crown and toe tip elevation. |
H_high | m | Upper bound estimate for height. Values from published sources or estimated based on uncertainty in crown and toe tip elevation. |
L | m | Total length of centerline of each failure from crown to tip of the toe. Values from published sources or estimated by measuring the centerline of each event with available imagery. |
L_low | m | Lower bound estimate for total length. Values from published sources or estimated based on uncertainty in centerline length. |
L_high | m | Upper bound estimate for total length. Values from published sources or estimated based on uncertainty in centerline length. |
OtherDataQuality | Integer | Relative quality of ancillary data about the event (imagery, media reports, publications, etc.); each successive ranking is inclusive of the criteria before it: 1 = seismic detection but no visual documentation; 2 = photographic documentation; 3 = media reports, non-peer reviewed scientific reports, and/or inclusion in broad peer-reviewed scientific study; 4 = satellite imagery available; 5 = detailed peer-reviewed scientific study. |
LPpotential | Integer | 0 = no long period (>20 sec) waves observed, 1 = clear long period waves observed, 2 = weak or questionable long periods observed. |
maxdistHF_km | km | Maximum distance that the high frequency seismic signal (1-5 Hz) was detected above the noise level on available seismic station data. Distance measured from latitude and longitude fields, which correspond in most cases to crown location. |
maxdistHF_reached | Boolean | True (1) if the furthest distance that the high frequency seismic signal (1-5 Hz) was detected above the noise level on available seismic data was reached, False (0) if the signal is likely visible further than maxdistHF_km but was not examined, Null if not applicable (e.g. no HF signal detected). |
maxdistLP_km | km | Maximum distance that the long period seismic signal (>20 sec on displacement waveforms) was detected above the noise level on available seismic data. Distance measured from latitude and longitude fields, which correspond in most cases to crown location. |
maxdistLP_reached | Boolean | True (1) if the furthest distance that the long period seismic signal (>20 sec on displacement waveforms) was detected above the noise level on available seismic records was reached, False (0) if the signal is likely visible further than maxdistLP_km but was not examined, Null if not applicable (e.g. no LP signal detected). |
DatLocation | | Location of seismic data (e.g. IRIS, NCEDC, relative filename(s) if attached). |
Area and Path Measurements
The rest of the fields in Table 1 require spatial analysis. We use published values of area, volume, mass, total vertical drop height (H), and total horizontal length (L) when available. In some studies the calculations are rigorous; in others the values are simply rough estimates. We capture uncertainty in published values by including the full possible range, when provided. In cases where different groups published differing estimates, we include the full range of all of the estimates and choose the best estimate based on the quality of the methodology. The source of each basic event information entry for each event is documented.
Published values do not exist for many of these events. Since the relation between event size and its seismic signal is of particular interest, we wanted to make the database as complete as possible in this aspect. Therefore, we made these calculations ourselves where possible using imagery and photos, superseding published values only when our methods were more rigorous than those described in the publication. To do so, we acquired satellite imagery from Digital Globe, Google Earth, Landsat, or NASA Earth Observatory. Satellite images were chosen based on visibility of event and proximity to event date. In most cases, we do not have permission to redistribute the raw satellite imagery, but we include static maps in these cases. The earliest satellite image after the event where the source area and deposit can be clearly distinguished was used for spatial analysis.
For each event for which we performed spatial analysis, we digitized an outline of the approximate total area based on surface disruptions like disturbed vegetation and changes in snow or soil color. We also estimated and digitized the landslide source area based on surface morphology observed in imagery and photos. However, source areas are difficult to estimate based on imagery alone, so a largest reasonable area (Area_source_high), smallest reasonable area (Area_source_low), and best estimate (Area_source) are all included.
Using the same imagery, we made estimations of runout distances (L) and drop heights (H) if there were no published values based on a midpoint line that is visually projected along the center of the failure in the direction of movement from the crown to the tip of the toe. The horizontal distance from crown to tip, L, is the length of this centerline in meters. The height H of the failure is the elevation difference between the topmost point of the centerline (crown) and the bottommost (tip). Where clear imagery of sufficiently high resolution was not available but photos were (generally the case for older events), we used landmarks in the landscape to find the approximate location of the crown and tip of the event in Google Earth. These points were then used to estimate and digitize the runout path, and consequently H and L. Elevation data used to calculate H are either obtained from the USGS National Elevation Dataset (NED) or from Google Earth’s topography model. In all cases, elevations and thus elevation differences are approximate and do not typically account for changes due to the event itself. Events with multiple runout directions, variable topography near the crown or toe, or unclear crown and tip locations result in uncertainties in H and L. In these cases, a maximum and minimum bound are also estimated (H_low, H_high, L_low, L_high). All of our digitized outlines and centerlines were projected into UTM coordinates before extracting area and distance measurements to reduce distortion.
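The measurements described above reduce to simple planar geometry once the outlines and centerlines are in projected (UTM) coordinates. The following sketch illustrates that geometry only; it is not the workflow used to build the catalogue. The function names and the coordinates are hypothetical, and it assumes vertices are given as (easting, northing) pairs in meters within a single UTM zone.

```python
import math


def centerline_length(points):
    """Sum of straight segment lengths along a polyline, in meters."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))


def polygon_area(vertices):
    """Planar area (square meters) of a simple polygon via the shoelace formula."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(vertices, vertices[1:] + vertices[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def drop_height(crown_elev_m, tip_elev_m):
    """H: elevation of the crown minus elevation of the tip of the toe."""
    return crown_elev_m - tip_elev_m


# Made-up geometry: a centerline with two segments and a rectangular outline.
centerline = [(0.0, 0.0), (300.0, 400.0), (300.0, 1400.0)]
outline = [(0.0, 0.0), (200.0, 0.0), (200.0, 100.0), (0.0, 100.0)]

print(centerline_length(centerline))  # 1500.0
print(polygon_area(outline))          # 20000.0
print(drop_height(1200.0, 850.0))     # 350.0
```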
Figure 1 shows an example event for which we performed this analysis. The quality and resolution of these outlines and their corresponding values are uncertain, as they depend on the resolution and clarity of the satellite images and on how the landscape changed in the time elapsed between the event and the date of acquisition. The period between the event date and the imagery date varies considerably, from a few days to a few years (Table A1). It is more difficult to estimate areas for events with larger time separations because they are more likely to have been affected by erosion, snow coverage, deposit washout, or additional slope movement.
Figure 1: Spatial analysis of Lamplugh Glacier Landslide (Eid 81). ArcMap World Imagery overlain by Digital Globe satellite image (top). Various point, line, and area designations shown in (bottom) figure legend. Total and source area boundaries are additive of all previous areas.
Volume and Mass estimation
Using the published or derived source areas, we then estimated volumes for events that do not have published values using a scaling relation between area and volume defined by Larsen and others (2010) based on bedrock scar geometry (Area_source). The exception is the 31 May 2013 debris slide that mobilized into a debris flow at Mount Baker, Washington (Eid 53) for which we used the relations from Larsen and others (2010) for soil instead of bedrock.
Volumes are calculated using our best estimate of the source area (Area_source), while upper and lower bounds (Volume_high and Volume_low) are found by adding or subtracting the volume error from the best estimate of volume. The volume error is found by propagating the uncertainties of the source area as well as those of the parameters of the scaling relation. To be conservative, we assume the high and low source area estimates correspond to plus and minus one standard deviation, even though they are upper and lower limits on what is reasonable. The uncertainties of the equation parameters are as reported by Larsen and others (2010).
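The calculation can be sketched as follows, assuming the power-law form of the area-volume relation, V = alpha * A^gamma, and a first-order propagation of independent uncertainties in log space. The exact propagation formula used for the catalogue is not given here, so the one below is an assumption, as are the function name, the treatment of the scale parameter uncertainty as a standard deviation on log10(alpha), and the use of half the high-low area range as one standard deviation. The parameter values in the usage line are placeholders for illustration only; the published values should be taken from Larsen and others (2010).

```python
import math


def volume_with_bounds(area, area_low, area_high, alpha, gamma,
                       sigma_log10_alpha, sigma_gamma):
    """Return (Volume, Volume_low, Volume_high) in cubic meters.

    Assumes V = alpha * area**gamma with area in square meters, and that
    the uncertainties of area, alpha, and gamma are independent.
    """
    volume = alpha * area ** gamma
    # Assumption: high and low areas are plus and minus one standard deviation.
    sigma_area = (area_high - area_low) / 2.0
    # First-order propagation through ln V = ln(alpha) + gamma * ln(area).
    rel_var = (
        (math.log(10.0) * sigma_log10_alpha) ** 2
        + (math.log(area) * sigma_gamma) ** 2
        + (gamma * sigma_area / area) ** 2
    )
    sigma_volume = volume * math.sqrt(rel_var)
    return volume, volume - sigma_volume, volume + sigma_volume


# Placeholder parameters, NOT the published values.
print(volume_with_bounds(5.0e4, 4.0e4, 6.0e4, alpha=0.2, gamma=1.3,
                         sigma_log10_alpha=0.05, sigma_gamma=0.01))
```

Because the error is added and subtracted symmetrically, a large relative uncertainty can produce a lower bound near or below zero; how such cases were handled in the catalogue is not stated here.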
For two rock fall sequences, the Nisqually rock falls at Mount Rainier, WA and the South Twin rock falls near Iliamna volcano, AK (Eids 25-30 and Eids 51 and 66-72, respectively), the total volume was estimated using the methods above and then distributed between the events in proportion to their relative amplitudes on a designated nearby station (LON.BHZ.UW and ILS.EHZ.AV, respectively), bandpass filtered from 1 to 5 Hz. This approach is only justifiable when the event types and paths are the same or very similar (Norris, 1994), but these criteria hold true for these two sequences, where numerous failures occurred from the same slope over hours or days. However, the volume estimates are likely highly uncertain and this is reflected in their reported uncertainties, though the uncertainty range could be higher if, for example, a single event actually reflects two collapses closely spaced in time. We only report masses when there are published values; these are generally estimated directly from seismic data and generally have large but sometimes unreported uncertainties.
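The proportional split described above amounts to weighting the total volume by each event's share of the summed amplitudes. The sketch below assumes a simple linear proportionality and a single amplitude value per event (for example, a peak amplitude of the filtered trace); the amplitude measure actually used is not specified here, and the function name and numbers are hypothetical.

```python
def distribute_volume(total_volume, amplitudes):
    """Split total_volume among events in proportion to their amplitudes."""
    total_amp = sum(amplitudes)
    if total_amp <= 0:
        raise ValueError("amplitudes must sum to a positive value")
    return [total_volume * amp / total_amp for amp in amplitudes]


# Made-up example: two events, the second with three times the amplitude.
print(distribute_volume(400.0, [1.0, 3.0]))  # [100.0, 300.0]
```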
Seismic data analysis
We examined the available seismic waveforms for each event included in the catalogue primarily in two frequency bands: high frequency waveforms (1-5 Hz), referred to as HF in field names, and long period displacement waveforms (20-60 sec), referred to as LP. To avoid unnecessarily discarding data that could not be corrected to physical units, high frequency (HF) detectability was determined by analyzing bandpass-filtered raw waveforms without performing a station response correction. Long period (LP) detectability was assessed on response-corrected waveforms, and therefore only for stations for which station response information was available. Many events have significant energy outside of the two bands analyzed. While we did not specifically target any other frequency bands, these characteristics are reflected in the standardized spectra and spectrograms that we generated for each event, discussed in the last paragraph of this section.
Once a location and event time were established, we performed an automated search for stations that were operating nearby at that time using the International Federation of Digital Seismograph Networks (FDSN) web services client for ObsPy. Due to the location of our events, we only searched for stations through the Incorporated Research Institutions for Seismology (IRIS) and Northern California Earthquake Data Center (NCEDC) web service providers. Except for small events such as local debris flows at volcanoes, we populated a table named sta_nearby for each event that linked stations summarized in the stations table to each event described in the events table, with source-to-station distance, back azimuth, and azimuth computed using pyproj with the WGS84 ellipsoid. Note that the coordinates for most events correspond to the uppermost location of the source area (when known). Most events travelled considerable distances from the source area, so these values in reality changed significantly over the duration of the event and represent just the starting conditions. Once the nearby stations were known, we downloaded any available data for these stations, moving incrementally toward greater radii from the source. These data were inspected in the two aforementioned frequency bands for detectability based on whether or not the signal was visible above the noise level. These detections were noted in sta_nearby.
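As an illustration of the distance and azimuth computation, the sketch below uses the inverse geodesic solution from pyproj's Geod class on the WGS84 ellipsoid. It is a minimal example rather than the code used to populate sta_nearby; the function name and the coordinates are hypothetical. It assumes the default pyproj behavior in which the second returned angle is the azimuth from the second point back toward the first, which corresponds to the seismological back azimuth (station to source).

```python
from pyproj import Geod

_WGS84 = Geod(ellps="WGS84")


def source_to_station_geometry(src_lat, src_lon, sta_lat, sta_lon):
    """Return (distance_km, azimuth_deg, back_azimuth_deg) on WGS84.

    azimuth is measured at the source toward the station, back azimuth at
    the station toward the source, both clockwise from north in [0, 360).
    """
    # Geod.inv takes longitudes before latitudes and returns meters.
    az, baz, dist_m = _WGS84.inv(src_lon, src_lat, sta_lon, sta_lat)
    return dist_m / 1000.0, az % 360.0, baz % 360.0


# Made-up coordinates: a station one degree due north of the source.
print(source_to_station_geometry(0.0, 0.0, 1.0, 0.0))
```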
Several general seismic parameters were investigated in the two aforementioned frequency bands for each event and are presented in the main events table (Table 1). These include the maximum distance at which the signal was visible above the noise level (maxdistHF_km and maxdistLP_km), also referred to as the detection limit. This was determined by inspecting the available seismic data in each of the two frequency bands (HF and LP), finding the most distant visible detection, and then inspecting data for tens to hundreds of kilometers further (depending on the size of the event) to check whether the event was detectable on more distant quiet stations. If it was not, the distance to the furthest station with a detection, rounded up to the nearest kilometer, was taken as the detection limit for that frequency band. Though the signal needed to be above the noise level to be counted, in many cases the detection would not have been noticeable had it not been included in a record section organized by distance from the source, which indicated that a low amplitude increase in noise was actually signal from a surface event. We did not find the detection limits in the long period range for most events because for some of the larger events it would have involved investigating seismic data from the entire globe, which was time consuming without adding much value. This is also true for a few of the largest events in the high frequency (HF) range. We also did not reach the detection limits for older events where the only data we have are trigger data from a local network, so more distant data no longer exist. The entries for maxdistHF_reached and maxdistLP_reached are 1 for events that we examined to their greatest detectable distances and 0 for those we did not.
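The bookkeeping behind the detection limit fields can be sketched as follows. The detections themselves were made by visual inspection, so this example only shows how a list of inspected stations might be reduced to the maxdist and reached values for one frequency band. The function name is hypothetical, and the rule used for the reached flag (at least one inspected station beyond the furthest detection showed no signal) is a simplifying assumption.

```python
import math


def detection_limit(observations):
    """Reduce (distance_km, detected) pairs to (maxdist_km, reached).

    Returns (None, None) when no station detected the signal, matching the
    Null entries described in Table 1.
    """
    detected = [dist for dist, seen in observations if seen]
    if not detected:
        return None, None
    furthest = max(detected)
    # Assumption: the limit counts as reached if any inspected station
    # beyond the furthest detection showed no signal.
    reached = any(dist > furthest and not seen for dist, seen in observations)
    return math.ceil(furthest), reached


# Made-up inspection results for one event in one frequency band.
stations = [(12.3, True), (45.1, True), (80.0, False), (60.2, True), (150.0, False)]
print(detection_limit(stations))  # (61, True)
```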
We also report whether an event generated observable long period (>20 sec) waveforms by setting the LPpotential field to 1 if such waveforms were clearly detectable above the noise level, 2 if long period waveforms were observable but barely above the noise level or possibly coincidental noise, and 0 if long period waveforms were not observable.
The event start time and end time reported in the main table are also extracted seismically and are approximate. They do not correspond to the actual start and end of the event, but to the seismically observable start and end of the signal on the closest stations. Ideally these would be taken from the station closest to the event, but since most signals emerge from the noise, the noise level controls the first observable start time. Since the noise level varies between stations, we take the earliest of the arrival times picked visually on the closest three stations. We use only the times observed in the high frequency band because at long periods, acausal filtering can make it difficult to pick start times with enough precision. We do not attempt to correct for travel times, so for events that have very proximate stations, the reported start time may be closer to the actual event time than for events recorded only on more distant stations. Therefore, the actual event time is earlier by an amount that varies from event to event based on station proximity and noise levels. To represent this uncertainty, we report pick times only to whole seconds, rounding start times down and end times up (Table 1).
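The reduction from individual picks to the reported StartTime and EndTime can be sketched as below, following the Table 1 definitions (earliest start rounded down to the second, latest end rounded up). The picks were made visually, so only the final step is shown; the function name and the pick times are made up for illustration.

```python
from datetime import datetime, timedelta


def seismic_start_end(picks):
    """Return (StartTime, EndTime) from (start, end) UTC picks.

    picks holds one (start, end) datetime pair per station, taken from the
    closest three stations with high frequency duration picks.
    """
    earliest = min(start for start, _ in picks)
    latest = max(end for _, end in picks)
    start = earliest.replace(microsecond=0)  # round down to the second
    end = latest.replace(microsecond=0)
    if latest.microsecond:
        end += timedelta(seconds=1)  # round up to the second
    return start, end


# Made-up picks on three stations.
picks = [
    (datetime(2005, 5, 15, 17, 11, 40, 700000), datetime(2005, 5, 15, 17, 16, 50)),
    (datetime(2005, 5, 15, 17, 11, 41, 200000), datetime(2005, 5, 15, 17, 16, 55, 300000)),
    (datetime(2005, 5, 15, 17, 11, 43), datetime(2005, 5, 15, 17, 16, 52, 100000)),
]
start, end = seismic_start_end(picks)
print(start, end)  # 2005-05-15 17:11:40 2005-05-15 17:16:56
```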
For each event, in addition to any photos and figures from outside sources, we generated standard seismic figures: a record section from the five closest stations with detections (if applicable) of raw waveforms normalized by their peak amplitudes (Figure 2a), high frequency (1-5 Hz) velocity waveforms plotted showing their absolute amplitudes (Figure 2b), and spectrograms and power spectra (multitaper) of waveforms with sensitivity removed (Figure 2d and e). There are two additional figures for events with long period detections: record sections of the long period (20-60 sec) displacement waveforms and displacement spectra from the five closest stations with long period detections (Figure 2c and f). Note that station response was removed without any pre-filtering for the long period spectral plots, but not for the general spectral plots. For the general spectral plots only sensitivity was removed because the intermediate and short period stations have a wide range of frequency responses, and accounting for and documenting all of those differences in an automated fashion was deemed impractical. The figures show both the signals and the noise, and because these are automated plots, we do not indicate which parts are signal and which are noise. This is of particular note for the spectra of long-period waveforms corrected for station response without pre-filtering, which can amplify noise at long periods, but it is generally true for all automated plots.
Figure 2: Example of waveform and spectral plots that are generated for each event in the catalogue, in this case for the signals generated by the 22 May 2016 rock and ice avalanche on Red Glacier at Iliamna volcano, AK (Eid 75), showing (a) raw unfiltered normalized waveforms, (b) velocity waveforms corrected for station response with 1-5 Hz prefiltering, (c) displacement waveforms corrected for station response with 20-60 sec prefiltering, (d) spectrograms and (e) velocity power spectra of waveforms shown in a, on data corrected for sensitivity only, and (f) displacement power spectra of broadband data shown in c, corrected for station response without prefiltering. Note that O20K.BDF.20.TA is an infrasound station and records acoustic waves traveling through the atmosphere, which explains the time delay of this record relative to the others.
References
Cruden, D.M., and Varnes, D.J., 1996, Landslide types and processes, in Turner, A.K., and Schuster, R.L., eds., Landslides investigation and mitigation, Special Report 247: Washington, D.C., National Academy Press, p. 36-75.
Hungr, O., Leroueil, S., and Picarelli, L., 2014, The Varnes classification of landslide types, an update: Landslides, v. 11, p. 167-194, doi: 10.1007/s10346-013-0436-y.
Larsen, I.J., Montgomery, D.R., and Korup, O., 2010, Landslide erosion controlled by hillslope material: Nature Geoscience, v. 3, p. 247-251, doi: 10.1038/NGEO776.
Norris, R.D., 1994, Seismicity of rockfalls and avalanches at three Cascade Range volcanoes: Implications for seismic detection of hazardous mass movements: Bulletin of the Seismological Society of America, v. 84, no. 6, p. 1925-1939.
Varnes, D.J., 1978, Slope movement types and processes, in Schuster, R.L., and Krizek, R.J., eds., Landslides, analysis and control, Special Report 176: Washington, D.C., Transportation Research Board, National Academy of Sciences, p. 11-33.