A Dirty Script to Download Every DJ Screw Mixtape

<2023-07-02 Sun>

I'm a huge fan of DJ Screw. His mixtapes make for superb background music.

What up HTINE? It's going DINE, baby. —DJ Screw

Back in 2021, I wrote a script to download all 300+ DJ Screw mixtapes. I had this script sitting on my computer for 2 years, but for some reason I never got around to letting it run to completion until yesterday. Now I got my 45GB of Screw tapes and there's no reason not to share my secret recipe with the world. You're welcome.

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Created on Mon Jul  5 08:14:42 2021

@author: nate
"""

import requests
import urllib
import wget
from lxml import html
import re
import json
import sys
import time
import os.path

import subprocess

# bar_progress is invoked automatically by the wget module to report progress
def bar_progress(current, total, width=80):
    progress_message = "Downloading: %d%% [%d / %d] bytes" % (current / total * 100, current, total)
    sys.stdout.write("\r" + progress_message)

def download_file(url):
    local_filename = url.split('/')[-2]
    # NOTE the stream=True parameter below
    with requests.get(url, stream=True) as r:
        with open(local_filename, 'wb') as f:
            for chunk in r.iter_content(chunk_size=8192):
                # If you have a chunk-encoded response, uncomment the if
                # below and set chunk_size to None.
                #if chunk:
                f.write(chunk)
    return local_filename

with open("./screw.txt","r") as f:
    links = f.read().split("\n")

for i in range(0,len(links)-1):
    print(f"================ {i}/{len(links)-1} =================")
    print(f"links[i]: {links[i]}")

    print("====== parsing page..... ")
    r = requests.get(links[i])
    tree = html.fromstring(r.content)
    x = tree.find_class("boxy-ttl hover-badge")
    dl_links = []
    for elt in x:
        link = "" + elt.get("href")
        dl_links.append(link)

    print("====== Done")
    L = dl_links[1]
    L = L.replace(" ", "%20")
    print(f"Got link: {L}")


    fname = "./output/"+ L.split("/")[4] + ".zip"
    print(f"fname: {fname}")

    if os.path.isfile(fname):
        print(f"{i}/{len(links)-1}: File already exists.")
        continue

    to_run = "wget "+ L + " --output-document="+fname
    print("running command: "+to_run)
    subprocess.run(to_run, shell=True)


I know how to write good web scrapers in Python, and this is awful by my standards. But there's no point in fixing it because it works. To run this, you'll also need to save the following text in a file ./screw.txt:
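For reference, screw.txt is just a newline-separated list of mixtape page URLs. Here's a minimal sketch of how the script consumes it — the URLs below are placeholders, not the real links:

```python
# Placeholder URLs; the real links are not reproduced here.
sample = (
    "https://example.com/mixtape/chapter-001/\n"
    "https://example.com/mixtape/chapter-002/\n"
)

# Mirrors the script: split the file contents on newlines.
links = sample.split("\n")

# The loop runs to len(links) - 1 because the trailing newline
# leaves an empty string at the end of the list.
for link in links[: len(links) - 1]:
    print(link)
```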

I don't recall where I found these links; I probably saved them from some crusty old forum post. I can be highly resourceful when it comes to finding things on the internet. There's no telling how long these files will stay generously hosted, so get them while you can.

Modified: 2023-09-30 20:15:13 EDT

Emacs 29.1.50 (Org mode 9.6.1)