Advanced RAG: Applying the Cohere Re-Ranker

Published October 30, 2023 by alex

What is the Cohere Re-Ranker?


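Cohere is a Canadian startup that provides natural language processing models to help companies improve human-machine interaction. Here we use Cohere's rerank endpoint inside a retriever. This builds on the ideas behind ContextualCompressionRetriever.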


What is contextual compression?


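One challenge with retrieval is that when you ingest data into the system, you usually do not know the specific queries the document store will face.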


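This means the information most relevant to a query may be buried in a document full of irrelevant text. Passing that whole document through your application can lead to more expensive LLM calls and poorer responses.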


Contextual compression is meant to address this problem.


The idea is simple: instead of immediately returning retrieved documents as-is, you can compress them using the context of the given query, so that only the relevant information is returned.


"Compressing" here refers both to compressing the contents of an individual document and to filtering out documents wholesale.


To use the contextual compression retriever, you need:


A base retriever
A document compressor


Steps in contextual compression


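The contextual compression retriever passes the query to the base retriever,
then runs the initial documents through the document compressor.
The document compressor takes the list of documents and shortens it, either by reducing the contents of documents or by dropping documents altogether.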
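The two steps above can be sketched in plain Python. This is only an illustration under stated assumptions: `Doc`, `toy_retriever`, `keyword_compressor`, and `compression_retrieve` are hypothetical names, and the keyword match stands in for whatever a real compressor does. It is not how LangChain or Cohere implement compression.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Doc:
    """Hypothetical stand-in for a retrieved document."""
    page_content: str


def _words(text: str) -> set:
    """Lower-cased words with simple punctuation stripped."""
    return {w.lower().strip("?.,") for w in text.split()}


def keyword_compressor(docs: List[Doc], query: str) -> List[Doc]:
    """Toy compressor: keep only sentences that share a word with the query.

    Documents with no matching sentence are dropped entirely.
    """
    query_words = _words(query)
    compressed = []
    for doc in docs:
        kept = [
            s.strip()
            for s in doc.page_content.split(".")
            if query_words & _words(s)
        ]
        if kept:  # shorten the content of a relevant document
            compressed.append(Doc(". ".join(kept) + "."))
        # otherwise the document is removed from the list
    return compressed


def compression_retrieve(
    query: str,
    base_retriever: Callable[[str], List[Doc]],
    compressor: Callable[[List[Doc], str], List[Doc]],
) -> List[Doc]:
    """Step 1: query the base retriever. Step 2: compress its results."""
    initial_docs = base_retriever(query)
    return compressor(initial_docs, query)


def toy_retriever(query: str) -> List[Doc]:
    """Hypothetical base retriever that returns two fixed documents."""
    return [
        Doc("Ethics are guiding moral principles. The weather was sunny."),
        Doc("Taylor studied work standards. Workers were selected carefully."),
    ]


result = compression_retrieve("what is ethics", toy_retriever, keyword_compressor)
for d in result:
    print(d.page_content)
```

In this toy run the first document is shortened to its one relevant sentence and the second is dropped, which covers both senses of "compression" described earlier.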


Cohere Re-Ranker implementation


Implementation stack


Cohere: re-ranking model
OpenAI: large language model
LangChain: framework for building applications with LLMs
Faiss-CPU: vector store


Install the required dependencies


!pip install -qU langchain 
!pip install -qU openai 
!pip install -qU cohere 
!pip install -qU faiss-cpu 
!pip install -qU tiktoken 
!pip install -qU pypdf 
!pip install -qU sentence_transformers


Import the required dependencies


# Standard library and the OpenAI client
import os
import openai
from getpass import getpass
# LangChain components: text splitting, embeddings, loaders, vector store,
# the QA chain, the chat model, and the compression retriever with Cohere re-ranking
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import OpenAIEmbeddings
from langchain.document_loaders import TextLoader
from langchain.vectorstores import FAISS
from langchain.document_loaders.pdf import PyPDFDirectoryLoader
from langchain.embeddings import HuggingFaceBgeEmbeddings
from langchain.chains import RetrievalQA
from langchain.chat_models import ChatOpenAI
from langchain.retrievers import ContextualCompressionRetriever
from langchain.retrievers.document_compressors import CohereRerank


Set the required API keys


os.environ["COHERE_API_KEY"] = getpass("Cohere API Key:")
os.environ["OPENAI_API_KEY"] = getpass("OpenAI API Key:")


Create a folder and upload the required PDF files into it


os.makedirs("Documenation", exist_ok=True)  # does not fail if the folder already exists


Define a helper function to print documents


def pretty_print_docs(docs):
    print(
        f"\n{'-' * 100}\n".join(
            [f"Document {i+1}:\n\n" + d.page_content for i, d in enumerate(docs)]
        )
    )


Load the PDFs


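pdf_folder_path = "/content/Documenation"
loader = PyPDFDirectoryLoader(pdf_folder_path)
docs = loader.load()
# Number of Document objects loaded (the PDF loader yields one per page)
print(len(docs))
# Text of the second loaded document
print(docs[1].page_content)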


################OUTPUT######################
39
Introduction to Management  Studies  
1 Introduction to Management Studies  
 
Alan S. Gutterman  
_______________  
 
§1 Introduction  
 
This Research Paper  introduces  the central and important topic of “manage ment studies”. 
The “study of management” covers a wide array of topics such as organizational t heory 
and behavior, strategic and human resources management, managerial functions and roles 
and identification and training of management skills.  The tools use d by practitioners of 
management studies to collect and analyze information and disseminate fin dings within 
the research community and to practicing managers are similarly diverse.  This Part 
includes a brief  description of the history and evolution of man agement studies, a 
daunting topic given that it is generally recognized that economic and milit ary activities 
have been raising issues of planning, directing and control for thousands of years and that 
one can find useful illustrations of management in the  building of the pyramids in ancient 
Egypt, the operation of the complex trade routes during th e Middle Ages and the 
commercial activities of the wealthy family businesses throughout the Renaissance. Over  
the last few decades hundreds of journals and perio dicals devoted to management studies 
have been launched and management has gone “ mainstream ” as books by authors such as 
Drucker and Peters have rocketed to the top of “best seller” lists.  The rise of 
management education, both at universities and throug h commercial private sector 
initiatives, has been fertile ground for textbooks.1 
 
§2 Definitions  of management  
 
Given that “management” has been so widely studied and practiced for literally 
thousands of years, it is not surprising to find a wide array of possible definitions of the 
term.  At the most basic level, the verb “manage” derives from the I talian word 
“maneggiare”, which is means “to handle”.  A number of definitions of “management” 
have focused on the specific tasks and activities that all manage rs, regardless of whether 
they are overseeing a business, a family or a social group, engage in,  such as planning, 
organizing, directing, coordinating and controlling.  One of the simplest, and often 
quoted, definitions of management was offered by Mary Pa rker Follett, who described it 
as “the art of getting things done through people”.2  The notion of “management through 
people” can also be found in the work of Weihrich and Koontz, who began with a basic 
 
1 There are a number of outstanding and compre hensive textbooks that cover a wide range of subjects 
pertaining to “management”, including G. Jones and J. George, Essentials of Contemporary Management 
(3d Ed) (New York, NY: McGraw -Hill Higher Education, 2009); J. Scott, The Concise Handbook of 
Manageme nt: A Practitioner’s Approach (London: Routledge, 2005); J. Schermerhorn, Management (11th 
Ed) (New York: Wiley, 201 1); R. Griffin, Management (10th Ed) (Boston, MA: South -Western College 
Publishing, 2010); and S. Robbins, M. Coulter and D. DeCenzo, Fundam entals of Management (7th Ed) 
(Upper Saddle River, NJ: Prentice Hall, 2010).  
2 M. Follett, “Dynamic Administration” in H. Metcalf and L. Urwick (Eds.), Dynamic Administration: The 
Collected Papers of Mary Parker Follett (New York: Harper & Row, 1942). 


Split the documents into smaller chunks


text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=100)
texts = text_splitter.split_documents(docs)
print(len(texts))
####################OUTPUT###############
368
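To make the roles of `chunk_size` and `chunk_overlap` concrete, here is a minimal sketch of fixed-size character windows with overlap. It is a simplification: RecursiveCharacterTextSplitter prefers to break on separators such as paragraph and line breaks, which this sketch ignores. The function name `sliding_chunks` is hypothetical.

```python
from typing import List


def sliding_chunks(text: str, chunk_size: int, chunk_overlap: int) -> List[str]:
    """Cut text into fixed-size character windows that overlap their neighbours."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reaches the end of the text
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
chunks = sliding_chunks(sample, chunk_size=10, chunk_overlap=4)
print(chunks)
# ['abcdefghij', 'ghijklmnop', 'mnopqrstuv', 'stuvwxyz']
```

Each chunk repeats the last 4 characters of the previous one, so a sentence that straddles a boundary still appears whole in at least one chunk as long as it is shorter than the overlap.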


Define the embedding model


model_name = "BAAI/bge-small-en-v1.5"
encode_kwargs = {'normalize_embeddings': True} # set True to compute cosine similarity
embeddings = HuggingFaceBgeEmbeddings(
    model_name=model_name,
    model_kwargs={'device': 'cpu'},
    encode_kwargs=encode_kwargs
)
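The `normalize_embeddings` comment rests on a simple identity: once vectors are scaled to unit length, their dot product equals their cosine similarity. A small worked check in plain Python, with made-up 2-D vectors standing in for real embeddings:

```python
import math
from typing import List


def norm(v: List[float]) -> float:
    """Euclidean length of a vector."""
    return math.sqrt(sum(x * x for x in v))


def normalize(v: List[float]) -> List[float]:
    """Scale a vector to unit length."""
    n = norm(v)
    return [x / n for x in v]


def dot(a: List[float], b: List[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def cosine(a: List[float], b: List[float]) -> float:
    return dot(a, b) / (norm(a) * norm(b))


a = [3.0, 4.0]  # illustrative vectors, not real embeddings
b = [4.0, 3.0]

cos_ab = cosine(a, b)                              # 24 / (5 * 5) = 0.96
dot_normalized = dot(normalize(a), normalize(b))   # 0.6*0.8 + 0.8*0.6 = 0.96
print(cos_ab, dot_normalized)
```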


Set up the base vector store retriever


vectorstore = FAISS.from_documents(texts, embeddings)
retriever = vectorstore.as_retriever(search_kwargs={"k": 20})
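Conceptually, a retriever with `k` set returns the `k` stored vectors closest to the query vector. The brute-force sketch below shows that idea with Euclidean distance. It is an assumption-laden illustration with hypothetical names (`l2_distance`, `top_k`) and toy 2-D vectors, not a description of FAISS internals.

```python
import heapq
import math
from typing import List, Sequence, Tuple


def l2_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def top_k(
    query_vec: Sequence[float], vectors: List[Sequence[float]], k: int
) -> List[Tuple[int, float]]:
    """Return (index, distance) pairs for the k vectors nearest to query_vec."""
    distances = [(i, l2_distance(query_vec, v)) for i, v in enumerate(vectors)]
    return heapq.nsmallest(k, distances, key=lambda pair: pair[1])


# Toy "embeddings"; index positions play the role of chunk ids.
vectors = [[1.0, 0.0], [0.0, 1.0], [0.6, 0.8], [-1.0, 0.0]]
nearest = top_k([0.0, 1.0], vectors, k=2)
print(nearest)
```

With unit-length vectors, ranking by Euclidean distance and ranking by cosine similarity give the same order, which is one reason to normalize the embeddings.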


Retrieve the most relevant contexts from the vector store for the query (without re-ranking)


query = "According to Kelly and Williams what is ethics?"
docs = retriever.get_relevant_documents(query)
pretty_print_docs(docs)


########################OUTPUT#############################
Document 1:
regarding ethics and values , role model ing, rewards for ethical behavior and swift and sure discipline for 
unethical behavior) and structures and systems that support and reinforce ethical behavior (i.e., 
organizational culture, code of ethics, ethics committee and chief ethics offi ce, ethics t raining and 
procedures for anonymous reporting of ethical concerns (“whistleblowing”)).  
 
Sources:  M. Kelly and C. Williams, “Business Ethics and Social Responsibility”, in M. Kelly and C.
----------------------------------------------------------------------------------------------------
Document 2:
individuals and groups use to analyze or interpret a situation and then decide what is right and the 
appropriate way to behave.  The concept of ethics can be viewed at several levels:  
 
77 Id. at 15 -16.
----------------------------------------------------------------------------------------------------
Document 3:
• The “practical” rule: An ethical decision is one that a manager would have no hesitation 
communicating to others both inside and outside of the organization because they would find it to be
----------------------------------------------------------------------------------------------------
Document 4:
that helps another person or group and which is the “right thing to do” even if the action is not in the 
manager’s own self -interest.  In order for t he manager t o act effectively and appropriately in those 
instances, he or she needs to have a fundamental understanding of ethics and how ethical principles apply 
to managers and their organizations.  
 
According to Kelly and Williams, ethics are the inner -guiding moral  principles, values, and beliefs that
----------------------------------------------------------------------------------------------------
Document 5:
Williams noted that there are no absolute or indisputable ethical rules or principals, but it has been 
suggested the following core values arguably transcend political, religious, class and ethnic differences : 
trustworth iness (i.e., honesty and following through on promises made); respect (i.e., showing 
consideration for others and treating them as you would like to be treated); responsibility (i.e.,
----------------------------------------------------------------------------------------------------
Document 6:
• Societal ethics are standards that govern how the members of a society deal with one another in 
matters that involve issues such  as fairness , justice, poverty and individual rights  
 
Van Auken argued that ethical managers demonstrated certain characteristics including:  
 
• Looking out for the interests of others, including customers, employees and minority members of 
society  
• Valuing e mployees as people as well as workers and giving respect to their family responsibilities,
----------------------------------------------------------------------------------------------------
Document 7:
Introduction to Management  Studies  
36  
• Individual ethics are  personal st andards and values that determine how people view their 
responsibilities to other people and groups and how they should act in situations where their own self -
interest is at stake  
• Occupational ethics are standards that govern how members of a p articular pr ofession, trade or craft 
should conduct themselves when performing their work -related activities
----------------------------------------------------------------------------------------------------
Document 8:
Williams, BUSN: Introduction to Business, Business Ethi cs and Socia l Responsibility (Independence, KY: 
Cengage Learning, 2015); P. Van Auken, http://business.baylor.edu/Phil_VanAuken/EthiclSupvr.html ; 
Josephson Institute’s 2009 Report Ca rd on the Et hics of American Youth Summary (“Universal Ethical 
Standards”); and L. Trevin, L. Harman and M. Brown, “Moral Person and Moral Manager”, California 
Management Review, 42(4) (Summer 2000), 128.
----------------------------------------------------------------------------------------------------
Document 9:
Introduction to Management  Studies  
35 and that they have access to all of the organizational intellectual capital that they need in 
order to successfully carry out their activities.  
 
§26 --Ethics  
 
Ethical behavior and social responsibility have become two important to pics for 
managers, particularly in light of the scandals and difficult economic times that have 
made their marks on the first decade of the new century.77  Government regulations such
----------------------------------------------------------------------------------------------------
Document 10:
consideration for others and treating them as you would like to be treated); responsibility (i.e., 
perseverance, self -discipline and personal accountability); fairness (i. e., providing equal opportunities and 
being open -minded); caring (i.e., kindness and compassion); and citizenship (i.e., cooperation and 
willingness to contribute to the broader community).  
 
Effective managers understand the beliefs and behavio rs of ethica l individuals and attempt to practice them
----------------------------------------------------------------------------------------------------
Document 11:
should conduct themselves when performing their work -related activities  
• Organizational ethics are the guiding principles through which an organization and its managers view 
their duties and responsibilities to the organ ization’s st akeholders (e.g., owners, managers, employees, 
suppliers and distributors, customers and the surrounding community)  
• Societal ethics are standards that govern how the members of a society deal with one another in
----------------------------------------------------------------------------------------------------
Document 12:
rights and privileges of the people affected by it, which means managers must take into account the 
effectiv e of each alternative decision on the rights of each affected stakeholder group.  
• The “justice” rule: An ethical decision is one that distributes both the benefits and the harms among the 
organizational stakeholders in a fair, equitable or impar tial manner  
• The “practical” rule: An ethical decision is one that a manager would have no hesitation
----------------------------------------------------------------------------------------------------
Document 13:
as they engaged in their managerial roles and activities.  Trevino et al. suggested that this means managing 
with integrity and honesty, inspiring trust from subordinates, treating people the right way  and playing  
fairly and striving for a high level of moral development.  In addition, managers must do what they can to 
create and maintain an ethical organization that is based on ethical leadership (i.e., leader communications
----------------------------------------------------------------------------------------------------
Document 14:
strive to find ways to deliver as much as possible to others  
• The “holism” principle: Remember to keep the “big picture” in mind at all times and recognize the 
importance of the personal side of employees in addition to their professional  activities,  the service 
side of business along with the profit side and the needs of the minority as well as the majority  
 
Kelly and Williams also offered ethical rules and principles that managers could use to analyze the impact
----------------------------------------------------------------------------------------------------
Document 15:
of their decisions on org anizational stakeholders:  
 
• The “utilitarian” rule:  An ethical decision is one that produces the greatest good for the greatest 
number of people, which means that managers should compare alternative courses of action based on 
the benefits and costs of each  alternative  for different organizational stakeholders  
• The “moral rights” rule: An ethical decision is the one that best maintains and protects the fundamental
----------------------------------------------------------------------------------------------------
Document 16:
Introduction to Management  Studies  
37 reasonable and acceptable (i.e., consistent with value s, norms and  standards typically acknowledged 
and applied within the organization)  
 
Legal and ethical principles are not necessarily the same; however, laws generally reflect the ethical norms 
at a particular time.  Ethical principles are also subject to c hange over t ime as societies evolve.  Kelly and
----------------------------------------------------------------------------------------------------
Document 17:
community involvement and religious beliefs  
• Not deceiving people and simply telling them what they want to hear rather than the truth  
• Not playing psychological games  with others , such as blame -shifting, practicing one -upmanship or 
playing favorites  
• Valuing people over pragmatism and recognizing that how things are achieved is just as important as 
what is achieved.  
• Focusing on the ultimate objective or mission (the “en ds”) more th an rules and regulations (the 
“means”)
----------------------------------------------------------------------------------------------------
Document 18:
work standards and practices could be discovered by  experimentation and 
observation. From this, it follows, that there is "one right way" for work to be 
performed.  
2. The selection of workers is a science. Taylor's "first class worker" was 
someone suitable for the job. It was management's role to determine  the kind of 
work for which an employee was most suited, and to hire and assign workers 
accordin gly.
----------------------------------------------------------------------------------------------------
Document 19:
corpor ations”, and there viability will depend in large part on the development of case 
law regarding the permissi ble purposes of such corporations and the flexibility afforded 
to directors in discharging their fiduciary duties.  
 
The Ethical Manager  
 
When carrying out their duties and responsibilities managers may often find themselves confronted with an 
“ethical dil emma”, which a situation in the manager must decide whether to take a certain course of action
----------------------------------------------------------------------------------------------------
Document 20:
“means”)  
• A commitment to ideals beyond self, such as honesty, fair play, and quality work  
 
Van Auken went on to recommend that managers understand and adhere to several guiding ethical 
principles when engaging in s upervisory b ehavior:  
 
• The “mission” principle: Stick to the basic mission of the organization (e.g., service, quality, value to 
the customer) as the day -to-day guide to supervisory behavior and decision making


The "generation" part of the RAG pipeline


#
llm = ChatOpenAI(model_name="gpt-3.5-turbo-16k",temperature=0.1)
#
qa = RetrievalQA.from_chain_type(llm=llm,
                                 chain_type="stuff",
                                 retriever=retriever)


%%time
#
print(qa.run(query=query))


###########################OUTPUT###########################
According to Kelly and Williams, ethics are the inner-guiding moral principles, values, and beliefs that individuals and groups use to analyze or interpret a situation and then decide what is right and the appropriate way to behave.
CPU times: user 249 ms, sys: 2.88 ms, total: 252 ms
Wall time: 3.15 s
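The chain above uses chain_type="stuff", which places all retrieved chunks into a single prompt. The sketch below approximates that idea. The template wording and the name `build_stuff_prompt` are assumptions for illustration, not LangChain's actual prompt.

```python
from typing import List


def build_stuff_prompt(question: str, chunks: List[str]) -> str:
    """Concatenate every retrieved chunk into one prompt (hypothetical template)."""
    context = "\n\n".join(chunks)
    return (
        "Use the following pieces of context to answer the question.\n\n"
        f"{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


question = "According to Kelly and Williams what is ethics?"
many_chunks = [f"chunk {i} text" for i in range(20)]  # like k=20 from the base retriever
few_chunks = many_chunks[:3]                          # like the top 3 after re-ranking

long_prompt = build_stuff_prompt(question, many_chunks)
short_prompt = build_stuff_prompt(question, few_chunks)
print(len(long_prompt), len(short_prompt))
```

Because every chunk is pasted into the prompt, the prompt length grows with the number of retrieved chunks. That is why passing fewer, better-ranked chunks shrinks the input the LLM has to read.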


Re-rank with CohereRerank


compressor = CohereRerank()
compression_retriever = ContextualCompressionRetriever(
    base_compressor=compressor, base_retriever=retriever
)
#
compressed_docs = compression_retriever.get_relevant_documents(query)
pretty_print_docs(compressed_docs)


##########################OUTPUT##########################
Document 1:
that helps another person or group and which is the “right thing to do” even if the action is not in the 
manager’s own self -interest.  In order for t he manager t o act effectively and appropriately in those 
instances, he or she needs to have a fundamental understanding of ethics and how ethical principles apply 
to managers and their organizations.  
 
According to Kelly and Williams, ethics are the inner -guiding moral  principles, values, and beliefs that
----------------------------------------------------------------------------------------------------
Document 2:
strive to find ways to deliver as much as possible to others  
• The “holism” principle: Remember to keep the “big picture” in mind at all times and recognize the 
importance of the personal side of employees in addition to their professional  activities,  the service 
side of business along with the profit side and the needs of the minority as well as the majority  
 
Kelly and Williams also offered ethical rules and principles that managers could use to analyze the impact
----------------------------------------------------------------------------------------------------
Document 3:
regarding ethics and values , role model ing, rewards for ethical behavior and swift and sure discipline for 
unethical behavior) and structures and systems that support and reinforce ethical behavior (i.e., 
organizational culture, code of ethics, ethics committee and chief ethics offi ce, ethics t raining and 
procedures for anonymous reporting of ethical concerns (“whistleblowing”)).  
 
Sources:  M. Kelly and C. Williams, “Business Ethics and Social Responsibility”, in M. Kelly and C.


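Looking at the compression retriever's output, it returned only the 3 most relevant contexts for the query out of the 20 returned by the base retriever. Three is the default value of CohereRerank's top_n parameter.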
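The retrieve-then-re-rank pattern can be sketched as follows. The scoring function here is a toy word-overlap measure standing in for Cohere's hosted relevance model, whose internals are not shown in this article. `overlap_score` and `rerank` are hypothetical names.

```python
from typing import Callable, List


def overlap_score(query: str, text: str) -> float:
    """Toy relevance score: fraction of query words that appear in the text."""
    query_words = set(query.lower().split())
    if not query_words:
        return 0.0
    text_words = set(text.lower().split())
    return len(query_words & text_words) / len(query_words)


def rerank(
    query: str,
    docs: List[str],
    score_fn: Callable[[str, str], float] = overlap_score,
    top_n: int = 3,
) -> List[str]:
    """Sort candidates by relevance to the query and keep the best top_n."""
    ranked = sorted(docs, key=lambda d: score_fn(query, d), reverse=True)
    return ranked[:top_n]


# Candidates as a first-stage retriever might return them (toy data).
candidates = [
    "taylor described the selection of workers",
    "ethics are moral principles according to kelly and williams",
    "kelly and williams offered rules",
    "the pyramids of ancient egypt",
    "williams noted there are no absolute rules about ethics",
]
top = rerank("kelly and williams ethics", candidates, top_n=3)
for text in top:
    print(text)
```

The first stage casts a wide net cheaply (k=20 in this article), and the second stage spends more effort scoring each candidate against the query before keeping only a few.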


Generation: RAG pipeline using the compression retriever


qa = RetrievalQA.from_chain_type(llm=llm,
                                 chain_type="stuff",
                                 retriever=compression_retriever )



%%time
#
print(qa.run(query=query))



##################################OUTPUT##########################
According to Kelly and Williams, ethics are the inner-guiding moral principles, values, and beliefs that strive to find ways to deliver as much as possible to others.
CPU times: user 246 ms, sys: 3.9 ms, total: 250 ms
Wall time: 2.42 s


Note: comparing the two runs, the response time dropped from 3.15 s without re-ranking to 2.42 s with re-ranking applied.
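The `%%time` magic reports one run, and a call that goes over the network can vary from run to run. Outside a notebook, a comparison like this could be repeated with a small helper. This is a sketch in which `fake_pipeline` is a hypothetical stand-in for a call such as `qa.run(query=query)`.

```python
import statistics
import time
from typing import Callable, List


def time_calls(fn: Callable[[], object], repeats: int = 5) -> List[float]:
    """Run fn several times and return the wall-clock duration of each call."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    return durations


def fake_pipeline() -> str:
    """Hypothetical stand-in for the real RAG call; sleeps briefly instead."""
    time.sleep(0.01)
    return "answer"


durations = time_calls(fake_pipeline, repeats=3)
print(f"median wall time: {statistics.median(durations):.3f} s")
```

Reporting the median of several runs for each retriever gives a steadier basis for comparison than a single measurement.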


Conclusion


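With re-ranking, we can improve the quality of the responses an LLM generates by reordering the retrieved contexts so that those that best match the query come first. This also ensures the LLM receives more relevant context, which can ultimately reduce the time needed to generate a response and improve its quality.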

Source: https://medium.com/dphi-tech/advanced-rag-cohere-re-ranker-99acc941601c