import os

os.environ["OPENAI_API_KEY"] = "YOUR_OPENAI_KEY"
from llama_index import VectorStoreIndex, SimpleDirectoryReader
from llama_index.postprocessor.flag_embedding_reranker import FlagEmbeddingReranker
from llama_index.schema import QueryBundle
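The call to retriever.retrieve(query) further down assumes an index and retriever that this excerpt does not show, and display_source_node needs its own import. A minimal sketch of that setup, assuming the TinyLlama paper sits at a hypothetical local path and a top-k of 3 to match the three nodes in the output:

from llama_index.response.notebook_utils import display_source_node  # import path assumed for this llama_index version

# Hypothetical setup: the file path, default chunking, and top-k are illustrative, not from the original run
documents = SimpleDirectoryReader(input_files=["/path/to/TinyLlama.pdf"]).load_data()
index = VectorStoreIndex.from_documents(documents)
retriever = index.as_retriever(similarity_top_k=3)  # return three candidate nodes per query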
query = "Can you provide a concise description of the TinyLlama model?"
nodes = retriever.retrieve(query)
for node in nodes:
    print('----------------------------------------------------')
    display_source_node(node, source_length=500)
# Adapted tail of llama_index's display_source_node: print the formatted node text
# instead of rendering notebook Markdown, so the output is visible in a terminal run
print(text_md)
# display(Markdown(text_md))
# if isinstance(source_node.node, ImageNode) and source_node.node.image is not None:
#     display_image(source_node.node.image)
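The separator line and the "Start reranking..." banner in the log below come from the reranking step, which this excerpt omits. A minimal sketch of that step, assuming a FlagEmbeddingReranker built on a bge reranker checkpoint with top_n=3 (both illustrative choices); postprocess_nodes scores each retrieved node against the query and reorders them:

# Hypothetical reranking step; the model name and top_n are assumptions, not from the original post
reranker = FlagEmbeddingReranker(
    model="BAAI/bge-reranker-base",
    top_n=3,  # keep all three candidates, only reorder them
)

print('------------------------------------------------------------------------------------------------')
print('Start reranking...')
reranked_nodes = reranker.postprocess_nodes(nodes, query_bundle=QueryBundle(query_str=query))
for node in reranked_nodes:
    print('----------------------------------------------------')
    display_source_node(node, source_length=500)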
(llamaindex_new) Florian:~ Florian$ python /Users/Florian/Documents/rerank.py
----------------------------------------------------
Node ID: 20de8234-a668-442d-8495-d39b156b44bb
Score: 0.8703492815379594
Text: 4 Conclusion In this paper, we introduce TinyLlama, an open-source, small-scale language model. To promote transparency in the open-source LLM pre-training community, we have released all relevant infor- mation, including our pre-training code, all intermediate model checkpoints, and the details of our data processing steps. With its compact architecture and promising performance, TinyLlama can enable end-user applications on mobile devices, and serve as a lightweight platform for testing a w...
----------------------------------------------------
Node ID: 47ba3955-c6f8-4f28-a3db-f3222b3a09cd
Score: 0.8621633467539512
Text: TinyLlama: An Open-Source Small Language Model Peiyuan Zhang∗Guangtao Zeng∗Tianduo Wang Wei Lu StatNLP Research Group Singapore University of Technology and Design {peiyuan_zhang, tianduo_wang, luwei}@sutd.edu.sg [email protected] Abstract We present TinyLlama, a compact 1.1B language model pretrained on around 1 trillion tokens for approximately 3 epochs. Building on the architecture and tok- enizer of Llama 2 (Touvron et al., 2023b), TinyLlama leverages various advances contr...
----------------------------------------------------
Node ID: 17cd9896-473c-47e0-8419-16b4ac615a59
Score: 0.8343984516104476
Text: Although these works show a clear preference on large models, the potential of training smaller models with larger dataset remains under-explored. Instead of training compute-optimal language models, Touvron et al. (2023a) highlight the importance of the inference budget, instead of focusing solely on training compute-optimal language models. Inference-optimal language models aim for optimal performance within specific inference constraints This is achieved by training models with more tokens...
------------------------------------------------------------------------------------------------
Start reranking...
----------------------------------------------------
Node ID: 47ba3955-c6f8-4f28-a3db-f3222b3a09cd
Score: 0.8621633467539512
Text: TinyLlama: An Open-Source Small Language Model Peiyuan Zhang∗Guangtao Zeng∗Tianduo Wang Wei Lu StatNLP Research Group Singapore University of Technology and Design {peiyuan_zhang, tianduo_wang, luwei}@sutd.edu.sg [email protected] Abstract We present TinyLlama, a compact 1.1B language model pretrained on around 1 trillion tokens for approximately 3 epochs. Building on the architecture and tok- enizer of Llama 2 (Touvron et al., 2023b), TinyLlama leverages various advances contr...
----------------------------------------------------
Node ID: 17cd9896-473c-47e0-8419-16b4ac615a59
Score: 0.8343984516104476
Text: Although these works show a clear preference on large models, the potential of training smaller models with larger dataset remains under-explored. Instead of training compute-optimal language models, Touvron et al. (2023a) highlight the importance of the inference budget, instead of focusing solely on training compute-optimal language models. Inference-optimal language models aim for optimal performance within specific inference constraints This is achieved by training models with more tokens...
----------------------------------------------------
Node ID: 20de8234-a668-442d-8495-d39b156b44bb
Score: 0.8703492815379594
Text: 4 Conclusion In this paper, we introduce TinyLlama, an open-source, small-scale language model. To promote transparency in the open-source LLM pre-training community, we have released all relevant infor- mation, including our pre-training code, all intermediate model checkpoints, and the details of our data processing steps. With its compact architecture and promising performance, TinyLlama can enable end-user applications on mobile devices, and serve as a lightweight platform for testing a w...