
  • part 2

        # ========================================================================
        #  Step 4 (Pass 1): Download at best quality, with a size cap
        # ========================================================================
        #  Tries: best AVC1 video + best M4A audio → merged into .mp4
        #  If a video exceeds MAX_FILESIZE, its ID is saved for the fallback pass.
        #  Members-only and premiere errors cause the video to be permanently skipped.
     
        echo "[$(date '+%Y-%m-%d %H:%M:%S')] [$Name] Pass 1: best quality under $MAX_FILESIZE"
     
        yt-dlp \
            "${common_opts[@]}" \
            --match-filter "!is_live & !was_live & original_url!*=/shorts/" \
            --max-filesize "$MAX_FILESIZE" \
            --format "bestvideo[vcodec^=avc1]+bestaudio[ext=m4a]/best[ext=mp4]/best" \
            "$URL" 2>&1 | while IFS= read -r line; do
                echo "$line"
                if echo "$line" | grep -q "^ERROR:"; then
     
                    # Too large → save ID for pass 2
                    if echo "$line" | grep -qi "larger than max-filesize"; then
                        vid_id=$(echo "$line" | grep -oP '(?<=\[youtube\] )[a-zA-Z0-9_-]{11}')
                        [[ -n "$vid_id" ]] && echo "$vid_id" >> "$SCRIPT_DIR/.size_failed_$Name"
     
                    # Permanently unavailable → skip forever
                    elif echo "$line" | grep -qE "members only|Join this channel|This live event|premiere"; then
                        vid_id=$(echo "$line" | grep -oP '(?<=\[youtube\] )[a-zA-Z0-9_-]{11}')
                        if [[ -n "$vid_id" ]]; then
                            if ! grep -q "youtube $vid_id" "$skip_file" 2>/dev/null; then
                                echo "youtube $vid_id" >> "$skip_file"
                                echo "[$(date '+%Y-%m-%d %H:%M:%S')] [$Name] Added $vid_id to skip file (permanent failure)"
                            fi
                        fi
                    fi
     
                    log_error "[$(date '+%Y-%m-%d %H:%M:%S')] ${Name} - ${URL}: $line"
                fi
            done
     
        # ========================================================================
        #  Step 5 (Pass 2): Retry oversized videos at lower quality
        # ========================================================================
        #  For any video that exceeded MAX_FILESIZE in pass 1, retry at 720p max.
        #  If it's STILL too large, log the actual size and skip permanently.
     
        if [[ -f "$SCRIPT_DIR/.size_failed_$Name" ]]; then
            echo "[$(date '+%Y-%m-%d %H:%M:%S')] [$Name] Pass 2: lower quality fallback for oversized videos"
     
            while IFS= read -r vid_id; do
                [[ -z "$vid_id" ]] && continue
                echo "[$(date '+%Y-%m-%d %H:%M:%S')] [$Name] Retrying $vid_id at 720p max"
     
                yt-dlp \
                    --proxy "$PROXY" \
                    --download-archive "$archive_file" \
                    --extractor-args "youtube:player-client=default,-tv_simply" \
                    --write-thumbnail \
                    --convert-thumbnails jpg \
                    --add-metadata \
                    --embed-thumbnail \
                    --merge-output-format mp4 \
                    --max-filesize "$MAX_FILESIZE" \
                    --format "bestvideo[vcodec^=avc1][height<=720]+bestaudio[ext=m4a]/bestvideo[height<=720]+bestaudio[ext=m4a]/best[height<=720]/worst" \
                    --output "$DOWNLOAD_DIR/${Name} - %(title)s.%(ext)s" \
                    "https://www.youtube.com/watch?v=${vid_id}" 2>&1 | while IFS= read -r line; do
                        echo "$line"
                        if echo "$line" | grep -q "^ERROR:"; then
     
                            # Still too large even at 720p — give up and log the size
                            if echo "$line" | grep -qi "larger than max-filesize"; then
                                filesize_info=$(yt-dlp \
                                    --proxy "$PROXY" \
                                    --extractor-args "youtube:player-client=default,-tv_simply" \
                                    --simulate \
                                    --print "%(filesize,filesize_approx)s" \
                                    "https://www.youtube.com/watch?v=${vid_id}" 2>/dev/null)
                                if [[ "$filesize_info" =~ ^[0-9]+$ ]]; then
                                    filesize_gb=$(echo "scale=1; $filesize_info / 1073741824" | bc)
                                    size_str="${filesize_gb}GB"
                                else
                                    size_str="unknown size"
                                fi
                                if ! grep -q "youtube $vid_id" "$skip_file" 2>/dev/null; then
                                    echo "youtube $vid_id" >> "$skip_file"
                                    log_error "[$(date '+%Y-%m-%d %H:%M:%S')] [$Name] Skipped $vid_id - still over $MAX_FILESIZE at 720p ($size_str)"
                                fi
                            fi
     
                            log_error "[$(date '+%Y-%m-%d %H:%M:%S')] ${Name} - ${URL}: $line"
                        fi
                    done
            done < "$SCRIPT_DIR/.size_failed_$Name"
     
            rm -f "$SCRIPT_DIR/.size_failed_$Name"
        else
            echo "[$(date '+%Y-%m-%d %H:%M:%S')] [$Name] Pass 2: no oversized videos to retry"
        fi
     
        # Clean up any stray .description files yt-dlp may have left behind
        find "$DOWNLOAD_DIR" -name "${Name} - *.description" -type f -delete
     
    done
    

  • There is no single go-to tutorial for stuff like this, because you could use any scripting language, and which ones you have available may depend on your OS.

    But honestly any half decent llm can generate something that works for your specific case.

    If you really want to avoid using those, here is a simple example for Windows PowerShell.

    
    # yt-dlp Channel Downloader
    # --------------------------
    # Downloads the latest video from each channel in channels.txt
    #
    # Setup:
    #   1. Install yt-dlp:  winget install yt-dlp
    #   2. Install ffmpeg:  winget install ffmpeg
    #   3. Create channels.txt next to this script, one URL per line:
    #        https://www.youtube.com/@SomeChannel
    #        https://www.youtube.com/@AnotherChannel
    #   4. Right-click this file → Run with PowerShell
    
    # Read each line, skip blanks and comments (#)
    foreach ($url in Get-Content ".\channels.txt") {
        $url = $url.Trim()
        if ($url -eq "" -or $url.StartsWith("#")) { continue }
    
        Write-Host "`nDownloading latest from: $url"
    
        yt-dlp --playlist-items 1 --merge-output-format mp4 --no-overwrites `
            -o "downloads\%(channel)s\%(title)s.%(ext)s" $url
    }
    
    Write-Host "`nDone."
    

    And here is my own bash script (linux) which has only gotten bigger with more customization over the years.

    #!/bin/bash
    # ============================================================================
    #  yt-dlp Channel Downloader (Bash)
    # ============================================================================
    #
    #  Automatically downloads new videos from a list of YouTube channels.
    #
    #  Features:
    #    - Checks RSS feeds first to avoid unnecessary yt-dlp calls
    #    - Skips livestreams, premieres, shorts, and members-only content
    #    - Two-pass download: tries best quality first, falls back to 720p
    #      if the file exceeds the size limit
    #    - Maintains per-channel archive and skip files so nothing is
    #      re-downloaded or re-checked
    #    - Embeds thumbnails and metadata into the final .mp4
    #    - Logs errors with timestamps
    #
    #  Requirements:
    #    - yt-dlp       (https://github.com/yt-dlp/yt-dlp)
    #    - ffmpeg        (for merging video+audio and thumbnail embedding)
    #    - curl          (for RSS feed fetching)
    #    - A SOCKS5 proxy on 127.0.0.1:40000 (remove --proxy flags if not needed)
    #
    #  Channel list format (Channels.txt):
    #    The file uses a simple key=value block per channel, separated by blank
    #    lines. Each block has four fields:
    #
    #      Cat=Gaming
    #      Name=SomeChannel
    #      VidLimit=5
    #      URL=https://www.youtube.com/channel/UCxxxxxxxxxxxxxxxxxx
    #
    #    Cat       Category label (currently unused in paths, available for sorting)
    #    Name      Short name used for filenames and archive tracking
    #    VidLimit  How many recent videos to consider per run ("ALL" for no limit)
    #    URL       Full YouTube channel URL (must contain the UC... channel ID)
    #
    # ============================================================================
    
    export PATH=$PATH:/usr/local/bin
    
    # --- Configuration ----------------------------------------------------------
    # Change these to match your environment.
    
    SCRIPT_DIR="/path/to/script"           # Folder containing this script and Channels.txt
    ERROR_LOG="$SCRIPT_DIR/download_errors.log"
    DOWNLOAD_DIR="/path/to/downloads"      # Where videos are saved
    MAX_FILESIZE="5G"                      # Max file size before falling back to lower quality
    PROXY="socks5://127.0.0.1:40000"       # SOCKS5 proxy (remove --proxy flags if unused)
    
    # --- End of configuration ---------------------------------------------------
    
    cd "$SCRIPT_DIR"
    
    # ============================================================================
    #  log_error - Append or update an error entry in the error log
    # ============================================================================
    #  If an entry with the same message (ignoring timestamp) already exists,
    #  it replaces it so the log doesn't fill up with duplicates.
    #
    #  Usage: log_error "[2025-01-01 12:00:00] ChannelName - URL: ERROR message"
    
    log_error() {
        local entry="$1"
    
        # Strip the timestamp prefix to get a stable key for deduplication
        local key=$(echo "$entry" | sed 's/^\[[0-9-]* [0-9:]*\] //')
    
        local tmp_log=$(mktemp)
        if [[ -f "$ERROR_LOG" ]]; then
            grep -vF "$key" "$ERROR_LOG" > "$tmp_log"
        fi
        echo "$entry" >> "$tmp_log"
        mv "$tmp_log" "$ERROR_LOG"
    }
    
    # ============================================================================
    #  Parse Channels.txt
    # ============================================================================
    #  awk reads the key=value blocks and outputs one line per channel:
    #    Category  Name  VidLimit  URL
    #  The while loop then processes each channel.
    
    awk -F'=' '
      /^Cat/ {Cat=$2}
      /^Name/ {Name=$2}
      /^VidLimit/ {VidLimit=$2}
      /^URL/ {URL=$2; print Cat, Name, VidLimit, URL}
    ' "$SCRIPT_DIR/Channels.txt" | while read -r Cat Name VidLimit URL; do
    
        archive_file="$SCRIPT_DIR/DLarchive$Name.txt"   # Tracks successfully downloaded video IDs
        skip_file="$SCRIPT_DIR/DLskip$Name.txt"          # Tracks IDs to permanently ignore
        mkdir -p "$DOWNLOAD_DIR"
    
        # ========================================================================
        #  Step 1: Check the RSS feed for new videos
        # ========================================================================
        #  YouTube provides an RSS feed per channel at a predictable URL.
        #  Checking this is much faster than calling yt-dlp, so we use it
        #  as a quick "anything new?" test.
    
        # Extract the channel ID (starts with UC) from the URL
        channel_id=$(echo "$URL" | grep -oP 'UC[a-zA-Z0-9_-]+')
        rss_url="https://www.youtube.com/feeds/videos.xml?channel_id=${channel_id}"
    
        # Fetch the feed and pull out all video IDs
        new_videos=$(curl -s --proxy "$PROXY" "$rss_url" | \
            grep -oP '(?<=<yt:videoId>)[^<]+')
    
        if [[ -z "$new_videos" ]]; then
            echo "[$(date '+%Y-%m-%d %H:%M:%S')] [$Name] RSS fetch failed or empty, skipping"
            continue
        fi
    
        # Compare RSS video IDs against archive and skip files.
        # If every ID is already known, there's nothing to do.
        has_new=false
        while IFS= read -r vid_id; do
            in_archive=false
            in_skip=false
    
            [[ -f "$archive_file" ]] && grep -q "youtube $vid_id" "$archive_file" && in_archive=true
            [[ -f "$skip_file" ]]    && grep -q "youtube $vid_id" "$skip_file"    && in_skip=true
    
            if [[ "$in_archive" == false && "$in_skip" == false ]]; then
                has_new=true
                break
            fi
        done <<< "$new_videos"
    
        if [[ "$has_new" == false ]]; then
            echo "[$(date '+%Y-%m-%d %H:%M:%S')] [$Name] No new videos, skipping"
            continue
        fi
    
        echo "[$(date '+%Y-%m-%d %H:%M:%S')] [$Name] New videos found, processing"
    
        # ========================================================================
        #  Step 2: Build shared option arrays
        # ========================================================================
    
        # Playlist limit: restrict how many recent videos yt-dlp considers
        playlist_limit=()
        if [[ $VidLimit != "ALL" ]]; then
            playlist_limit=(--playlist-end "$VidLimit")
        fi
    
        # Options used during --simulate (dry-run) passes
        sim_base=(
            --proxy "$PROXY"
            --extractor-args "youtube:player-client=default,-tv_simply"
            --simulate
            "${playlist_limit[@]}"
        )
    
        # Options used during actual downloads
        common_opts=(
            --proxy "$PROXY"
            --download-archive "$archive_file"
            --extractor-args "youtube:player-client=default,-tv_simply"
            --write-thumbnail
            --convert-thumbnails jpg
            --add-metadata
            --embed-thumbnail
            --merge-output-format mp4
            --output "$DOWNLOAD_DIR/${Name} - %(title)s.%(ext)s"
            "${playlist_limit[@]}"
        )
    
        # ========================================================================
        #  Step 3: Pre-pass — identify and skip filtered content
        # ========================================================================
        #  Runs yt-dlp in simulate mode twice:
        #    1. Get ALL video IDs in the playlist window
        #    2. Get only IDs that pass the match-filter (no live, no shorts)
        #  Any ID in (1) but not in (2) gets added to the skip file so future
        #  runs don't waste time on them.
    
        echo "[$(date '+%Y-%m-%d %H:%M:%S')] [$Name] Pre-pass: identifying filtered videos (live/shorts)"
    
        all_ids=$(yt-dlp "${sim_base[@]}" --print "%(id)s" "$URL" 2>/dev/null)
        passing_ids=$(yt-dlp "${sim_base[@]}" \
            --match-filter "!is_live & !was_live & original_url!*=/shorts/" \
            --print "%(id)s" "$URL" 2>/dev/null)
    
        while IFS= read -r vid_id; do
            [[ -z "$vid_id" ]] && continue
            grep -q "youtube $vid_id" "$archive_file" 2>/dev/null && continue
            grep -q "youtube $vid_id" "$skip_file"    2>/dev/null && continue
            if ! echo "$passing_ids" | grep -q "^${vid_id}$"; then
                echo "youtube $vid_id" >> "$skip_file"
                echo "[$(date '+%Y-%m-%d %H:%M:%S')] [$Name] Added $vid_id to skip file (live/short/filtered)"
            fi
        done <<< "$all_ids"
    
        # ========================================================================
        #  Step 4 (Pass 1): Download at best quality, with a size cap
        # ========================================================================
        #  Tries: best AVC1 video + best M4A audio → merged into .mp4
        #  If a video exceeds MAX_FILESIZE, its ID is saved for the fallback pass.
        #  Members-only and premiere errors cause the video to be permanently skipped.
    
        echo "[$(date '+%Y-%m-%d %H:%M:%S')] [$Name] Pass 1: best quality under $MAX_FILESIZE"
    
        yt-dlp \
            "${common_opts[@]}" \
            --match-filter "!is_live & !was_live & original_url!*=/shorts/" \
            --max-filesize "$MAX_FILESIZE" \
            --format "bestvideo[vcodec^=avc1]+bestaudio[ext=m4a]/best[ext=mp4]/best" \
            "$URL" 2>&1 |
    





  • I love the concept, I have even been working on something similar, but, big buts…

    Recommend Ubuntu? While many are moving away from it.

    AI chat with Ollama as a prominent feature? Controversy aside, this survival computer had better pack some hardware, which may cost more precious, possibly limited, power.

    A note-taking app? Besides the intention to run it on Ubuntu, which I presume already includes something that works with Markdown… any computer with a terminal can take notes as far as I know.

    Hardware scoring and a community leaderboard? wtaf

    Things like offline Wikipedia in Kiwix are indeed pretty cool, but in general the way this software describes itself feels sloppy and based more on vibes than anything thought out.











  • I would still consider myself a noob, but I do feel accomplished enough to answer this properly.

    Hardware depends on your budget. It does not need to be bleeding edge either; I would focus on a good server case that makes it easy to upgrade over time and maybe fits a few hard drives if you don’t plan on having a NAS.

    Also make sure to check how many SATA connections your motherboard can handle; using an M.2 slot may occupy some of the physical SATA connections.

    I highly, highly recommend Proxmox for an OS.

    You can set up each service in its own LXC container; it’s wonderful to know you can experiment with whatever, and everything else will be unaffected and just keep working. Within an LXC, things can just run using Docker (though this is officially not recommended, it works fine). The resource sharing between LXC containers is excellent. Taking snapshots is a breeze. And when an LXC is not enough, you can easily spin up a VM with whatever distro, or even Windows. Best server choice I ever made!
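
    As a rough sketch of what that looks like on the Proxmox host (the container ID, template name, storage names, and resource sizes here are placeholders, not from my actual setup):

    ```
    # Create an unprivileged LXC container for one service (all values are examples)
    pct create 101 local:vztmpl/debian-12-standard_12.2-1_amd64.tar.zst \
        --hostname mediaserver \
        --memory 1024 --cores 2 \
        --rootfs local-lvm:8 \
        --net0 name=eth0,bridge=vmbr0,ip=dhcp \
        --unprivileged 1

    pct start 101             # start the container
    pct snapshot 101 clean    # snapshots really are a one-liner
    ```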

    The ZFS format for your storage pool is also very good. And you definitely want redundancy: redundancy means a certain number of drives can fail and the system just keeps running like normal while you replace the broken drive; otherwise a single drive failing ruins all your data.

    Unless you make every drive its own pool with specific items that you back up separately, but that’s honestly more troublesome than learning how to set up a pool.

    How you want a pool and how much redundancy is a personal choice, but I can tell you how I arranged mine.

    I have 5 identical drives, which is the max my system can handle. 4 of them are in a pool with a raidz1 configuration (equivalent to RAID 5). This setup gives me 1 drive of redundancy and leaves me 3 drives of actual usable space.
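
    The capacity arithmetic can be sketched in shell (the 8TB drive size is a made-up example, not my actual drives): in raidz1, roughly one drive’s worth of space goes to parity.

```shell
#!/bin/bash
# raidz1 rule of thumb: N drives give about (N - 1) drives of usable space,
# and the pool survives one drive failing. Drive size below is an example.
drives=4
size_tb=8
usable_tb=$(( (drives - 1) * size_tb ))
redundancy=1   # raidz1 tolerates exactly one failed drive
echo "raidz1: ${drives}x${size_tb}TB -> ${usable_tb}TB usable, ${redundancy} drive of redundancy"
```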

    I could have added the fifth drive to the pool for more space, but I opted not to, to protect my Immich photos against complete critical failure. This fifth drive is unmounted when not in use.

    Basically, my Immich storage is in a dataset, which you can think of as a directory on your pool that you can assign to different LXCs to keep things separate.

    Every week a script will mount the fifth drive, rsync my Immich dataset from the pool onto it, and unmount the drive again. It’s a backup of the most important stuff outside of the pool.
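
    A minimal sketch of such a weekly job (the device, mountpoint, and dataset paths are placeholders, not my actual setup; it only prints the commands unless you set DRY_RUN=false):

```shell
#!/bin/bash
# Weekly backup sketch: mount the standby drive, rsync the Immich dataset
# onto it, then unmount. All paths and devices below are placeholders.
BACKUP_DEV="/dev/sdX1"            # the fifth drive (placeholder)
MOUNT_POINT="/mnt/cold-backup"    # placeholder mountpoint
SOURCE="/tank/immich/"            # dataset mountpoint on the pool (placeholder)
DRY_RUN="${DRY_RUN:-true}"        # defaults to a harmless dry run

run() {
    # Print each command; only execute it when DRY_RUN=false.
    echo "+ $*"
    if [[ "$DRY_RUN" == "false" ]]; then
        "$@"
    fi
}

run mount "$BACKUP_DEV" "$MOUNT_POINT"
run rsync -a --delete "$SOURCE" "$MOUNT_POINT/immich/"
run umount "$MOUNT_POINT"
```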

    This drive can also be removed from the front of the case in an emergency, which is part of why I recommend spending some time finding a case that fits your wants more than worrying about how much RAM.

    Best of luck!