PersonaPlex-7B MLX 8-bit

PersonaPlex 7B is a full-duplex speech-to-speech model, converted here to MLX safetensors with 8-bit quantization for Apple Silicon.

Converted from nvidia/personaplex-7b-v1 (based on the Kyutai Moshi architecture).

Swift inference: soniqo/speech-swift

Model Details

| Component | Architecture | Size |
|---|---|---|
| Temporal Transformer | 32-layer, 4096d, 32 heads (7B params) | ~6.5 GB (8-bit) |
| Depformer | 6-layer, 1024d, 16 heads, per-codebook weights | ~1.3 GB (8-bit) |
| Mimi Codec | SEANet encoder/decoder + 8-layer transformer + 16 RVQ codebooks | ~370 MB (fp16) |
| Embeddings | Text + 16 audio embeddings + output heads | ~940 MB (fp16) |
| **Total** | | ~9.1 GB |
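As a sanity check, the per-component sizes above sum to the quoted total (the values below are simply the table figures restated in GB):

```swift
import Foundation

// Component sizes from the table above, in GB:
// Temporal Transformer (8-bit), Depformer (8-bit), Mimi Codec (fp16), Embeddings (fp16)
let componentGB: [Double] = [6.5, 1.3, 0.37, 0.94]
let total = componentGB.reduce(0, +)
print(String(format: "Total ≈ %.1f GB", total))  // → Total ≈ 9.1 GB
```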

Usage

Swift (via soniqo/speech-swift):

```swift
let model = try await PersonaPlexModel.fromPretrained(
    modelId: "aufklarer/PersonaPlex-7B-MLX-8bit"
)
let response = model.respond(audio: samples, voice: .NATF0, steps: 100)
```

Command line:

```sh
audio personaplex input.wav --model aufklarer/PersonaPlex-7B-MLX-8bit -o output.wav
```

Variants

| Variant | Quantization | Size | Model ID |
|---|---|---|---|
| 4-bit | 4-bit | ~4.9 GB | aufklarer/PersonaPlex-7B-MLX-4bit |
| 8-bit | 8-bit | ~9.1 GB | aufklarer/PersonaPlex-7B-MLX-8bit |

Voices

18 voice presets are available: NATF0-3, NATM0-3, VARF0-4, and VARM0-4. Select one via the `voice` parameter, as in the Usage example above.
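For reference, the four preset families expand to 18 identifiers. A minimal sketch of that expansion (an assumption here is that the `Voice` cases in soniqo/speech-swift mirror these names, as `.NATF0` does in the Usage example):

```swift
// Expand the preset families listed above into their 18 identifiers:
// NATF0-3 (4), NATM0-3 (4), VARF0-4 (5), VARM0-4 (5).
let presets =
    (0...3).map { "NATF\($0)" } +
    (0...3).map { "NATM\($0)" } +
    (0...4).map { "VARF\($0)" } +
    (0...4).map { "VARM\($0)" }
print(presets.count)  // → 18
```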

