k8shazgpu: using "vllm run" on Kubernetes

by dougbtv 4 months ago

GNU/Linux • xterm-256color • bash 169 views

This recording demonstrates a “vLLM run” with k8shazgpu on Kubernetes: we launch changes to our vLLM code on a remote Kubernetes cluster, which handles all the GPU allocation for us. We get almost instant feedback on our changes, and we can even queue up runs when the server is loaded and get the results when they finish.

https://asciinema.org/a/745629
