• Krank Star@lemmy.world (OP)
      1 year ago

      I use Stable Diffusion.

      Sir mechs a lot
      Steps: 20, Sampler: DPM++ 3M SDE Karras, CFG scale: 7, Seed: 2748980831, Size: 768x1280, Model hash: 74dda471cc, Model: realvisxlV20_v20Bakedvae, Clip skip: 2, RNG: CPU, Version: v1.6.0
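
In case anyone wants to reuse those settings in a script, here is a rough Python sketch of how the settings text above could be split into key/value pairs. The `parse_params` name is made up for illustration and is not part of any Stable Diffusion tool, and the naive split assumes no value contains ", " itself.

```python
def parse_params(line: str) -> dict[str, str]:
    """Split a 'Key: value, Key: value' settings line into a dict (hypothetical helper)."""
    params = {}
    for part in line.split(", "):
        # partition() returns an empty separator if ': ' is missing; skip those parts
        key, sep, value = part.partition(": ")
        if sep:
            params[key.strip()] = value.strip()
    return params


settings = parse_params(
    "Steps: 20, Sampler: DPM++ 3M SDE Karras, CFG scale: 7, "
    "Seed: 2748980831, Size: 768x1280, Model hash: 74dda471cc, "
    "Model: realvisxlV20_v20Bakedvae, Clip skip: 2, RNG: CPU, Version: v1.6.0"
)

# All values stay strings; convert the ones you need, e.g. the image size.
width, height = (int(n) for n in settings["Size"].split("x"))
```

Everything comes back as a string, so numeric fields like the seed or step count need an explicit `int(...)` before use.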

    • tal
      1 year ago

      I suppose that if someone built a system where an LLM maps not just the spoken text but also a text description to speech (say, Tortoise TTS with a Stable Diffusion-style prompt describing the voice), it’d be possible to hear SirMechsALot’s voice. That’d be interesting.