Pull requests: huggingface/nanotron
Pull requests list
Removed assertion for s3 datasets and handled string and object cases
    #381 opened Jul 3, 2025 by SulRash
    2 of 6 tasks

Fixed nanoset data stage handling during pretraining
    #380 opened Jul 3, 2025 by SulRash
    2 of 6 tasks

Fix issue while running tiny llama script on ADA 4000 gpu
    #379 opened Jul 2, 2025 by chetandhembre
    2 of 6 tasks

Extra name argument to select configuration of hf dataset
    #378 opened Jun 30, 2025 by SulRash
    1 of 6 tasks

[feature] Add debug_dataloader_samples utility to preview decoded dataloader samples (#184)
    #368 opened May 26, 2025 by garongkim
    6 tasks

[Feature] Hide 75% of the communication in tensor parallelism using DoMiNo
    #292 opened Mar 10, 2025 by xrsrke

Fix unpacking issue caused by newer Flash Attention
    #289 opened Mar 5, 2025 by Stillerman
    3 of 6 tasks

[Feature] Over 99% communication overlap in Tensor Parallelism using Domino
    #286 opened Mar 1, 2025 by hwchen2017