Artificial Intelligence

Written by Jan on Friday, 13 December 2024 at 09:36

Yesterday I learned something. And whenever I learn an important lesson, I try to do my write-up in English, because a) it is an opportunity for me to use my rusty language skills and b) I have to really think about the things I write down. Thinking is important. But we will come to that later!

I have used ChatGPT to write a more or less simple programme, a bash script that outputs a PDF file of all the images it finds in a directory, arranged in a grid, like an overview or one of these contact sheets you used to get from the photo shop back in the dark ages before computers were a thing. Just for fun I used an LLM because I am curious to find out new things. My experience was, from a programmer's point of view, somewhat OKish. I didn’t expect much more than what I got.

People who know me also know that I despise AI-generated content. I think it is theft of intellectual property for making some shareholder a quick buck. I think it desensitises us, you and me and all of humanity, to reality. It insulates us from the hard cold facts by creating a dream world that does not exist. And I think that it makes people believe they are artists or musicians or even programmers when in fact they are not. So, here is what I learned.

Over the past few months, since AI became the next big buzzword in an industry ruled over by idiots, I have had several run-ins with people who claim that AI is a great tool to get you started, to set you on the path to something new. So I deliberately chose a thing I am not particularly good at: a bash script. Bash is, for those who don't speak Computer, one of the most used shells - if not the most used - in Linux and other Unix-like operating systems. The old folks may remember when they had to type commands into DOS to get it to start their favourite word processor. This is like that, only it's not. Bash - or any other command line interpreter in Linux - together with the associated tools you find on any Unix-like machine is so much more. It is like a programming language of its own. It can do amazing stuff. It can automate tedious tasks and run them in the background where no-one ever notices them, or it can create beautiful works of art and entertainment. And it can do harm. It can bring down your machine no questions asked and destroy you, your family, and your family’s pet in an instant because you misspelled one command line argument.

I am not good at bash. In my opinion it is a mess of conflicting specifications. That may be because it is rooted in Unix systems of the 1970s, when things were done wild west style - shoot first, ask questions later. I much prefer "real", "modern" languages with training wheels like C++, PHP or even Java. But for my little project, bash seemed appropriate. So I asked Mr. GPT: "Hello, can you write me a bash script that has the following function: There are between 30 and 40 image files in a directory. A PDF file is to be created from these, which contains a preview of these files on a DIN A4 page in a 4 by 5 grid with file names as subtitles."[1] Sounds simple enough, and in any language I know, I know what I would do: Get the directory listing, parse it for image files, prepare them by resizing and put them in a grid, which then needs to be converted into a PDF file via some tools or libraries that already exist. A lifetime ago I wrote something similar in Haskell. For the love of God, not Haskell, why, oh why?!

And ChatGPT gives me exactly that, in a more or less well documented file. But, even at first glance, you get the feeling that something is not quite right. It all looks so clean. It looks so professional. This is a rough first draft and I don't expect it to be perfect, but the AI behind it all presents it to me like it is the final product, tries to explain what it has done, and - like the good dog it pretends to be - asks me for a treat. Instead I tell it to modify the code so that I can specify the input directory from the command line and dutifully it obeys. Then I pretend I'm dumb (like 99% of people probably are, me included) and tell it that I get a warning message that "convert" (the tool it uses to resize images) is deprecated and one should use "magick" now. This is a fairly new development and I, too, always forget this. No harm, no foul, Mr. GPT does what I tell it to and replaces all instances of convert in the code.

Next: I want them all! Regardless of what spelling is used in the file extension. You see, on Linux file systems "JPG" is not the same as "jpg" or "Jpg". Poor old DOS never learned that particular trick and 45 years later Windows still can't. It tries to hide this with all sorts of trickery, but in the end, there can only be one ring to rule them all. (It is not so much a restriction of the file system itself - NTFS can have files with different capitalizations coexisting happily side by side - but of the way they are accessed.) Anyways, here is where Mr. GPT surprises me: Instead of filtering for more and different spellings of file extensions like I would have expected, it uses the built-in "shopt" command to tell the bash instance executing this script to ignore case from now on. I consider this cheating. It basically dumbs down bash to the level of DOS. (BTW, the "correct" thing to do - in my opinion - would be to ask "file" for help in identifying which files really are images and which are not. Because there is always someone who saves a JPG as a TXT and vice versa!) But, for now, let's just keep this modification and see what happens next.
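Asking "file" for help could look roughly like the sketch below. It is only an illustration under assumptions: the helper name is_image is made up, and it presumes that the "file" utility with its --mime-type and -b options is installed.

```bash
#!/bin/bash
# Hypothetical helper: succeeds if the *content* of "$1" looks like an image,
# no matter what the file extension claims.
is_image() {
  local mime
  # -b prints only the MIME type, without the file name in front of it
  mime=$(file --mime-type -b -- "$1" 2>/dev/null) || return 1
  case "$mime" in
    image/*) return 0 ;;
    *)       return 1 ;;
  esac
}

# Usage sketch (commented out, "$dir" is a placeholder):
# for f in "$dir"/*; do is_image "$f" && echo "$f"; done
```

With a check like this the extension stops mattering at all, so the case-insensitive globbing would not even be needed; the price is one call to "file" per directory entry.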

Mind you, until now the PDF this first draft script produces has all the images in one line at the bottom of the page. No grid in sight. To be honest, at this point I don't know myself what's going on. So I tell it to do something about it. And here you can clearly see that the Intelligence part of AI is just a simulation, fakery, some sophisticated algorithm that pretends to be intelligent but actually is not. It has all the information and the code is even written by itself, yet it does not understand. All it has done is copy pieces of code that other people wrote into one file and hope that no one notices that there is no logic or glue holding everything together. After a while and some debugging outputs it dawns on me and I tell it where the thing it wrote makes a boo-boo, but not why. And suddenly it gets it: COLUMNS is a reserved variable in bash, storing the width of the terminal it is running in, so when you set it to something else it retains that value for a while until it is reset by something - most likely when some external programme is called (like magick). Could any human programmer have made this mistake? Absolutely. Especially someone like me who is not very well versed in bash scripting. But should ChatGPT have caught that mistake? Should it even have made it in the first place? BTW: One can use shopt to suppress this behaviour (the option in question is checkwinsize). For someone so keen on using this command it should have known, because it is in the freaking documentation! But it chooses the clean way and renames COLUMNS and ROWS to something better.
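For the record, a minimal sketch of the two ways around this pitfall. It assumes a bash that has the checkwinsize option (recent versions enable it by default); the names GRID_COLUMNS and GRID_ROWS are simply the ones the final script ends up using.

```bash
#!/bin/bash
# COLUMNS and LINES are maintained by bash itself: with checkwinsize set,
# bash may overwrite them after every external (non-builtin) command.

# Way 1: tell bash to leave LINES and COLUMNS alone.
shopt -u checkwinsize

# Way 2: use names that bash does not manage.
GRID_COLUMNS=4
GRID_ROWS=5

# An external command in between does not touch our own variables.
date >/dev/null
echo "grid: ${GRID_COLUMNS}x${GRID_ROWS}"
```

Renaming is the more robust of the two, because it does not depend on any shell option being set one way or the other.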

Slowly the PDF takes shape. But 4 by 5 is not 40, like I asked for initially. It would take two pages to get them all. So I tell it and it adds another loop for more images. Looks OK. Wasn't the most difficult request, so I didn't expect any problems here. But I would have expected some mathematical capabilities from the start! 4 times 5 is clearly 20! Any seven-year-old kid still in elementary school would have been able to catch that mistake.
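The arithmetic behind the page loop is a plain rounded-up division. A small sketch, with a made-up helper name pages_needed:

```bash
#!/bin/bash
# Hypothetical helper: number of pages needed for a given number of images.
# Integer division rounds down, so (per_page - 1) is added first to round up.
pages_needed() {
  local total=$1 cols=$2 rows=$3
  local per_page=$(( cols * rows ))
  echo $(( (total + per_page - 1) / per_page ))
}

pages_needed 40 4 5   # prints 2
pages_needed 20 4 5   # prints 1
```

The final script further down uses the same rounding trick when it computes PAGE_COUNT.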

Now for something more problematic: I want all the images to take up about the same space. At the moment they only get resized in one dimension, which means that those in panorama orientation are somewhat small but those in portrait orientation are HUGE! Spoiler: One trick I normally use is to put each image into a square background first and then resize it. So, let's say, an image of dimensions 3840x2560 would be expanded to 3840x3840 with white borders on the top and bottom, and then be resized to whatever is needed for the script. Likewise an image of 2560x3840 would also end up as a 3840x3840 image first and then be resized. This way any aspect ratio would be kept, but all images would appear to be similar in size.
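A rough sketch of that square-first trick, under assumptions: ImageMagick 7 (the "magick" command) is installed, and the helper names square_side and pad_and_resize are invented for this illustration.

```bash
#!/bin/bash
# Hypothetical helper: edge length of the square that holds a WxH image.
square_side() {
  local w=$1 h=$2
  if [ "$w" -gt "$h" ]; then echo "$w"; else echo "$h"; fi
}

# Hypothetical helper: pad an image to a white square, then shrink it.
# $1 = source image, $2 = destination, $3 = target edge length in pixels.
pad_and_resize() {
  local src=$1 dst=$2 edge=$3 dims w h side
  # Read width and height of the first frame, e.g. "3840 2560"
  dims=$(magick identify -format '%w %h' "${src}[0]") || return 1
  w=${dims% *}
  h=${dims#* }
  side=$(square_side "$w" "$h")
  magick "$src" -background white -gravity center \
    -extent "${side}x${side}" -resize "${edge}x${edge}" "$dst"
}

square_side 3840 2560   # prints 3840
square_side 2560 3840   # prints 3840
```

Both example images end up on a 3840x3840 canvas, so after resizing they occupy cells of identical size while keeping their own aspect ratio.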

This task needs creativity. It is a roundabout way to get to a not so obvious result. And as you can imagine: ChatGPT fails. I mean, I could try and nudge it in the right direction for some more iterations of its code, but I have better things to do, so I abandon this problem for the moment. I don't believe that, without my help, it would ever get an idea of what to do or even what I want it to do. We are a long way away from the self-programming holodecks of the Star Trek future!

Everything else I ask it for this evening is more or less prettifying the result. There is no real intelligence needed for that, just fine tuning some parameters. After that I consider "my" script complete for now.

So, how was the experience? Weeeell. You know when you ask some AI generator for the picture of a person and then it has six fingers on one hand and three eyes? It's not that bad anymore, but you get the distinct feeling that you have to hold its hand all throughout the process because it can trip at any second. It's like having a puppy and training it to not do its business on the carpet. It's not mean, it just doesn't know any better. But it means you have to be on high alert all the time. Also: because all the AI does is copy pre-existing work from random sources you will never get a consistent result. The programming style is all over the place! It is like a painting that has been painted by a hundred different artists. Which is ironic, because in the end this script has only about 90 lines of code!

Yes, it is true: AI may get you started when you have no clue. It may write the boring pieces of code every programmer has to have somewhere, the nuts and bolts that keep the thing running but are always the same, like setting up variables and stuff. But real creativity? I didn't see any of that. Inspiration? Nope. The code is sterile, just like AI-generated "art" looks and sounds and reads.

"Well, at least it works, right?" some might say. But to get it there it took me the better part of three hours! I could have written it in about the same amount of time all by myself, the old fashioned way, reading the documentation, and I would have learned stuff along the way. Now I have a programme, and if I wasn't a programmer myself and knew how things work, I would still have no clue as to what is going on. So no, it didn't save me any time and I missed an opportunity to improve myself.

Also, it doesn't! If I had not been there, holding its hand, guiding it along the way, there would be no working script in the end. It would have been perfectly happy with its first draft. Which had multiple problems. Even now there is no error checking, not even an idea of making sure that what goes into the script is indeed an image or what goes out is a valid PDF. It just trusts everyone and everything. Like that wide-eyed puppy it is, the one that just made a mess on the floor.
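A minimal sketch of what such a safety net might look like. Everything here is an assumption for illustration: the helper names require_tool and looks_like_pdf are invented, and checking the first five bytes for "%PDF-" is only a crude plausibility test, not a validation of the PDF.

```bash
#!/bin/bash
# Hypothetical helper: complain on stderr if a required tool is missing.
require_tool() {
  command -v "$1" >/dev/null 2>&1 || { echo "Missing tool: $1" >&2; return 1; }
}

# Hypothetical helper: succeeds only if "$1" is a regular file that starts
# with the PDF magic bytes "%PDF-".
looks_like_pdf() {
  [ -f "$1" ] && [ "$(head -c 5 < "$1")" = "%PDF-" ]
}

# Usage sketch (commented out so that the block has no side effects):
# require_tool magick || exit 1
# TMP_DIR=$(mktemp -d) && trap 'rm -rf "$TMP_DIR"' EXIT   # clean up on any exit
# ... build the PDF ...
# looks_like_pdf "$OUTPUT_PDF" || { echo "No valid PDF produced" >&2; exit 1; }
```

Cleaning up through a trap has the side benefit that the temporary directory also disappears when the script aborts halfway through.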

Conclusion: Don't use AI on topics that you don't know anything about. Verify with other sources. You should never believe anything you read on the internet, and that goes for AI, too. Especially for AI, because it was trained using the internet, which you shouldn't have trusted in the first place! If you do use AI in a creative setting such as programming, and if you do so on a professional level, I hope you know what you are doing! Just imagine some dude selling this kind of crap to his employer (and please keep in mind, most employers don't give a flying F* about the code you write as long as they can earn money using it), an employer who builds e.g. the black box that in some hospital room keeps your granny alive or controls the nuclear power plant next door or decides which terrorist to airstrike next with an unmanned drone. I think "the AI made me do it" is not a valid excuse in these cases.

Yesterday I learned something: billions of years of brain evolution beat AI even at simple tasks.

[1] Actually I asked "Hallo. Kannst Du mir ein bash Script schreiben, das folgende Funktion ausführt: In einem Verzeichnis liegen zwischen 30 und 40 Bilddateien. Aus diesen soll eine PDF-Datei erstellt werden, welche auf einer DIN A4 Seite eine Vorschau dieser Dateien in einem 4 mal 5 Grid enthält mit Dateinamen als Untertitel." But you know, Englisch!

Here is what I got in the end, with some modifications I did myself: (Last Update: 16.12.2024.)

#!/bin/bash

# Show help if no directory is given
if [ "$#" -lt 1 ]; then
  echo "Usage: $0 <input-dir> [output-file] [options]"
  echo "  <input-dir>: directory containing the images"
  echo "  [output-file]: (optional) name of the output file (default: output.pdf)"
  echo "  Options:"
  echo "    --columns=N      number of columns in the grid (default: 4)"
  echo "    --rows=N         number of rows in the grid (default: 6)"
  echo "    --margin=N       margin size in pixels (default: 600)"
  echo "    --annotation=N   height of the captions in pixels (default: 30)"
  echo "    --dpi=N          resolution in DPI (default: 600)"
  echo "    --include-path   show the path of the input directory at the top of each page"
  exit 1
fi

# Default values for the options
GRID_COLUMNS=4
GRID_ROWS=6
MARGIN=600
ANNOTATIONHEIGHT=30
DPI=600
INCLUDE_PATH=false

# Set directory and output file
INPUT_DIR="$1"
shift
OUTPUT_PDF="output.pdf"

if [[ "$1" != "" && "$1" != --* ]]; then
  OUTPUT_PDF="$1"
  shift
fi

# Process the options
while [[ "$1" == --* ]]; do
  case "$1" in
    --columns=*)
      GRID_COLUMNS="${1#*=}";;
    --rows=*)
      GRID_ROWS="${1#*=}";;
    --margin=*)
      MARGIN="${1#*=}";;
    --annotation=*)
      ANNOTATIONHEIGHT="${1#*=}";;
    --dpi=*)
      DPI="${1#*=}";;
    --include-path)
      INCLUDE_PATH=true;;
    *)
      echo "Unknown option: $1"
      exit 1;;
  esac
  shift
done

# Check that the input directory exists
if [ ! -d "$INPUT_DIR" ]; then
  echo "The input directory $INPUT_DIR does not exist!"
  exit 1
fi

# Determine the absolute path of the input directory
if [[ "$INPUT_DIR" == /* ]]; then
  FULL_INPUT_PATH="$INPUT_DIR"
else
  FULL_INPUT_PATH="$(cd "$INPUT_DIR" && pwd)"
fi

# Create the temporary directory
TMP_DIR=$(mktemp -d)
PAGE_FILES=()

# Compute DPI-based sizes (conversion from mm to pixels)
MM_TO_INCH=0.0393701
PAGE_WIDTH=$(echo "$DPI * 210 * $MM_TO_INCH" | bc | awk '{printf "%.0f", $0}')   # 210 mm for DIN A4
PAGE_HEIGHT=$(echo "$DPI * 297 * $MM_TO_INCH" | bc | awk '{printf "%.0f", $0}') # 297 mm for DIN A4

# Margin used in the size calculations (taken as given, not doubled)
ADJUSTED_MARGIN=$((MARGIN))

# Compute the thumbnail size
THUMB_WIDTH=$(( (PAGE_WIDTH - ADJUSTED_MARGIN) / GRID_COLUMNS ))

# Account for the extra space needed by the title line
if [ "$INCLUDE_PATH" = true ]; then
  TOTAL_ANNOTATION_HEIGHT=$((ANNOTATIONHEIGHT * (GRID_ROWS + 1))) # extra line for the title
else
  TOTAL_ANNOTATION_HEIGHT=$((ANNOTATIONHEIGHT * GRID_ROWS)) # image captions only
fi

THUMB_HEIGHT=$(( (PAGE_HEIGHT - TOTAL_ANNOTATION_HEIGHT - ADJUSTED_MARGIN) / GRID_ROWS ))
MAX_IMAGES=$((GRID_COLUMNS * GRID_ROWS)) # max. images per page

# Function to shorten file names
shorten_filename() {
  local filename="$1"
  local base="${filename%.*}" # strip the file extension
  local parts
  IFS='-' read -ra parts <<< "$base" # split the name at the hyphens

  # Keep the first and last segments, joined by " - "
  local shortened="${parts[0]} - ${parts[-1]}"
  echo "$shortened"
}

# Function to extract the middle segments of a file name
shorten_data() {
  local filename="$1"
  local base="${filename%.*}" # strip the file extension
  local parts
  IFS='-' read -ra parts <<< "$base" # split the name at the hyphens

  # Drop the first and last segments and keep what lies in between
  parts=("${parts[@]:1}")
  if (( ${#parts[@]} > 0 )); then
    unset "parts[${#parts[@]}-1]"
  fi
  local shortened=""
  local a
  for a in "${parts[@]}"; do shortened="${shortened} ${a}"; done

  echo "$shortened"
}

# Create thumbnails with shortened file names
echo "Creating thumbnails..."
shopt -s nullglob nocaseglob  # ignore case when globbing and expand non-matching patterns to nothing
for IMAGE in "$INPUT_DIR"/*.{jpg,jpeg,png,gif}; do
  if [ -f "$IMAGE" ]; then
    FILE_TYPE=$(file --mime-type -b "$IMAGE")
    if [[ "$FILE_TYPE" == image/* ]]; then
      BASENAME=$(basename "$IMAGE")
      SHORTENED_NAME=$(shorten_filename "$BASENAME")
      SHORTENED_EXIF=$(shorten_data "$BASENAME")
      magick "$IMAGE" -resize "${THUMB_WIDTH}x${THUMB_HEIGHT}" \
        -gravity center -background white -extent "${THUMB_WIDTH}x${THUMB_HEIGHT}" \
        -gravity northwest -background white -splice 0x50 \
        -pointsize "$ANNOTATIONHEIGHT" -annotate +0+5 "$SHORTENED_NAME" \
        -gravity southeast -background white -splice 0x50 \
        -pointsize "$ANNOTATIONHEIGHT" -annotate +0+5 "$SHORTENED_EXIF" \
        "$TMP_DIR/$(basename "$IMAGE")"
    else
      echo "Skipping $IMAGE: not a valid image."
    fi
  fi
done

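# Split into pages
echo "Splitting images into pages..."
# Collect the thumbnails while nullglob and nocaseglob are still active, so that
# upper-case extensions match and non-matching patterns expand to nothing
IMAGE_LIST=("$TMP_DIR"/*.{jpg,jpeg,png,gif})
shopt -u nullglob nocaseglob  # reset the options
TOTAL_IMAGES=${#IMAGE_LIST[@]}
PAGE_COUNT=$(( (TOTAL_IMAGES + MAX_IMAGES - 1) / MAX_IMAGES ))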

for ((PAGE = 0; PAGE < PAGE_COUNT; PAGE++)); do
  START=$((PAGE * MAX_IMAGES))
  END=$((START + MAX_IMAGES - 1))
  if [ "$END" -ge "$TOTAL_IMAGES" ]; then END=$((TOTAL_IMAGES - 1)); fi

  PAGE_IMAGES=("${IMAGE_LIST[@]:$START:$((END - START + 1))}")
  PAGE_FILE="$TMP_DIR/page_$PAGE.jpg"

  magick montage "${PAGE_IMAGES[@]}" \
    -tile "${GRID_COLUMNS}x${GRID_ROWS}" \
    -geometry "${THUMB_WIDTH}x${THUMB_HEIGHT}+10+10" \
    -gravity center -bordercolor black -border 2 \
    -background white "$PAGE_FILE" 2>/dev/null

  # Optional: add the path
  if [ "$INCLUDE_PATH" = true ]; then
    magick "$PAGE_FILE" -gravity north -background white -splice 0x50 \
      -pointsize "$ANNOTATIONHEIGHT" -annotate +0+15 "$FULL_INPUT_PATH" \
      "$PAGE_FILE"
  fi

  # Alternate the horizontal alignment between even and odd pages
  ODDEVEN=$((PAGE % 2))
  if [ "$ODDEVEN" -eq 0 ]; then
    magick "$PAGE_FILE" -gravity east -extent "${PAGE_WIDTH}x${PAGE_HEIGHT}" "$PAGE_FILE"
  else
    magick "$PAGE_FILE" -gravity west -extent "${PAGE_WIDTH}x${PAGE_HEIGHT}" "$PAGE_FILE"
  fi

  PAGE_FILES+=("$PAGE_FILE")
done

# Merge all pages into one PDF
echo "Creating final PDF..."
magick convert "${PAGE_FILES[@]}" -page A4 "$OUTPUT_PDF" 2>/dev/null

# Remove temporary files
rm -r "$TMP_DIR"

echo "Done! The PDF was saved as $OUTPUT_PDF."

(Just look at all the chatter this script produces! If it is not an error that wants to be debugged, it does not belong on stdout! In my opinion, a shell script of this kind should simply terminate without any message if everything went OK.)
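One way to get there, sketched under assumptions: the VERBOSE variable and the log helper below are invented for this illustration and are not part of the script above.

```bash
#!/bin/bash
# Hypothetical convention: progress messages appear only when VERBOSE=1, and
# even then on stderr, so that stdout stays empty on success.
VERBOSE=${VERBOSE:-0}

log() {
  if [ "$VERBOSE" = "1" ]; then echo "$*" >&2; fi
}

log "Creating thumbnails..."   # silent unless the script is run with VERBOSE=1
```

Replacing the plain echo calls with something like this would keep the progress messages available for debugging without cluttering normal runs.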
