Ulysses Sequence Parallelism: Training with Million-Token Contexts
Posted 5 hours ago by ibobev (1 point)
https://huggingface.co/blog/ulysses-sp
0 comments