Key Concepts
The authors conducted a detailed replication study of the BASS framework, focusing on challenges in replicating key components and discrepancies in performance compared to the original work.
Summary
The study replicates the BASS framework for abstractive summarization based on Unified Semantic Graphs. Challenges in replication, discrepancies in performance, and recommendations for writing replicable papers are highlighted. The study emphasizes the importance of clear technical descriptions and self-explanatory details to ensure successful replication.
The authors implemented components such as pre-processing, graph construction, text encoder alignment, and the model architecture. They identified discrepancies between the replicated methods and the original paper. During replication they encountered missing information, algorithmic complexity, and error-prone steps.
Key findings include differences between USGsrc and USGppr structures, challenges in aligning graph and text embeddings, and issues with graph propagation using PageRank. Recommendations focus on providing clear technical context, notation precision, and commented pseudo-code for better reproducibility.
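For readers unfamiliar with PageRank-style propagation over a graph, the following is a minimal, generic sketch of power-iteration PageRank. It is not the BASS authors' implementation or the replication's code; the function name `pagerank`, the damping factor of 0.85, the iteration count, and the toy graph are all assumptions chosen for illustration.

```python
def pagerank(adjacency, damping=0.85, iterations=50):
    """Generic power-iteration PageRank (illustrative, not the BASS code).

    adjacency maps each node to a list of successor nodes; every
    successor must also appear as a key of the mapping.
    """
    nodes = list(adjacency)
    n = len(nodes)
    # Start from a uniform distribution over all nodes.
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Score held by nodes without outgoing edges is spread uniformly.
        dangling = sum(rank[node] for node in nodes if not adjacency[node])
        new_rank = {
            node: (1.0 - damping) / n + damping * dangling / n
            for node in nodes
        }
        # Each node passes its score in equal shares to its successors.
        for node in nodes:
            targets = adjacency[node]
            for target in targets:
                new_rank[target] += damping * rank[node] / len(targets)
        rank = new_rank
    return rank


# Hypothetical toy graph; the node names are made up for this example.
graph = {
    "patent": ["claim", "device"],
    "claim": ["device"],
    "device": ["patent"],
}
scores = pagerank(graph)
```

The scores always sum to one, so they can be read as a relative importance weighting over graph nodes. How such scores are combined with node embeddings is exactly the kind of detail the study found hard to recover from the original description.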
Statistics
The BigPatent dataset includes 1,204,631 training documents.
The BASS model had 201M parameters.
Training models took about 4 hours on average.
Pre-processing runtime was up to 10 hours per chunk.
The replicated model ended up with approximately 205M trainable parameters.
Quotes
"We found some inconsistencies between author information, paper details, and source code."
"Our results indicate poor performance due to model architecture rather than graph structure."
"Challenges included missing third-party information and ambiguity of self-explanatory details."