Shawn M. Jones, PhD / @shawnmjones@hachyderm.io (@shawnmjones)'s Twitter Profile
Shawn M. Jones, PhD / @shawnmjones@hachyderm.io

@shawnmjones

Find me @joinmastodon 🐘: @shawnmjones@hachyderm.io | @bluesky 🌌: @shawnmjones.org | Scientist @LosAlamosNatLab | @WebSciDL 4Life

ID: 22826489

Website: http://www.shawnmjones.org/ | Joined: 04-03-2009 20:13:24

18.18K Tweets

784 Followers

671 Following

Michael L. Nelson (@phonedude_mln)

Web archive collection summarization is a topic we (WS-DL Group, ODU CS) have been interested in for a while. Shawn M. Jones, PhD / @shawnmjones@hachyderm.io's dissertation created the Dark and Stormy Archives (DSA) Framework, which surfaced a small number of exemplars for summarization. oduwsdl.github.io/dsa/

Michael L. Nelson (@phonedude_mln)

DSA was created pre-LLM, so the tools were not available then, and perhaps more importantly, the prevalence and users' acceptance of the "chat" modality. A hybrid (LLM & exemplar) model would be an exciting avenue of study. blog: lil.law.harvard.edu/blog/2024/02/1…

Michael L. Nelson (@phonedude_mln)

Some webpages are immortal, but most are ephemeral. Our preliminary report on our study of 27M webpages archived by @WaybackMachine. @WebSciDL @internetarchive @FilFoundation @oducs 🧵 1/ ws-dl.blogspot.com/2024/09/2024-0…

Michael L. Nelson (@phonedude_mln)

We sampled from 25 years of data from Wayback, collecting about 1M URLs that were first archived in each year between 1996-2021. Then we re-crawled them in 2023. Authors: Kritika garg, Sawood Alam, Dietrich Ayala, Michele Weigle, Michael L. Nelson 2/

Michael L. Nelson (@phonedude_mln)

Our high-level results:

* Only 35.3% of the webpages were still alive in 2023

* The median lifespan of a URL (for those dead in 2023) is **2.3 years**

3/

Michael L. Nelson (@phonedude_mln)

We found that 83% of our sample was not crawled at all in 2023. If Wayback stops crawling a URL, it's likely to have died. In fact, if a URL has not been archived successfully for many years, most likely it's dead. 4/

Michael L. Nelson (@phonedude_mln)

We found, not surprisingly, that the lifespan of root URLs was much longer than that of deeplinks.

* root: 10% died within a year, ~50% lived > 10 years

* deep: 42% died within a year

5/

ODU College of Sciences (@odusci)

As the road to earning a Ph.D. is long but well worth the wait, the Women in Science & Engineering (WISE) took time to meet and build community and belonging. Support is important and these #astounding ladies are resilient and purpose driven! #WOMENINSTEM #Monarchpride

Lamia Salsabil (@liya_lamia)

I am excited to share that I presented our paper, "ETD-MS v2.0: A Proposed Extended Standard for Metadata of Electronic Theses and Dissertations," at the ETD 2024 conference in Livingstone, Zambia! #ETD2024 WS-DL Group, ODU CS Jian Wu Bill Ingram ⛵️

Michele Weigle (@weiglemc)

I'm migrating my social media presence away from here and over to Mastodon (digipres.club/@weiglemc) and LinkedIn (linkedin.com/in/michele-wei…). These links and more are on my ODU Computer Science webpage at cs.odu.edu/~mweigle/ (redirects to weiglemc.github.io). WS-DL Group, ODU CS

Himarsha R. Jayanetti (@himarshaj)

Our hard work paid off😄🥳 So glad we went for it, even with all the other stuff on our plates! A big thanks to Computer Science Graduate Society - ODU for organizing this event! 👏 Kritika garg Kumushini Thennakoon #AllGirlsTeam #girlpower #WomenInSTEM #Steminist #CSGSHackathon2024 ODU Computer Science Old Dominion University ODU College of Sciences

Yasasi (@yasasi_abey)

This summer, I had the opportunity to participate in the COVES Policy Fellowship (COVES Fellowship) hosted by VASEM. During the fellowship, I worked at Joint Commission on Technology and Science (JCOTS). nirds-lab WS-DL Group, ODU CS ODU Computer Science ODU College of Sciences Blog: ws-dl.blogspot.com/2024/11/2024-1…

Michael L. Nelson (@phonedude_mln)

In a guest WS-DL Group, ODU CS post, Herbert Van de Sompel (@hvdsompl) discusses the recent move to register "doi:" as a URI scheme with IANA. TL;DR: It's likely more about branding than actual utility. ws-dl.blogspot.com/2024/11/2024-1…

Michael L. Nelson (@phonedude_mln)

I don't remember this game, but back in the day I typed in my share of source code from magazines like ANALOG, Antic, and SoftSide. Nostalgia aside, it was really awful: it took forever, saving to tape was slow, & most of the games were bad. But what else were you going to do?