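Editor's Picks - Technical Article Translations · August 29

15 Open-Source Projects for AI Engineers 🧙‍♂️ 🪄

AI is all the rage, and there is a lot of hype around it. Some say it will change the world as we know it (in a bad way), while others call it a fad.

However, as Elon Musk puts it, "The most entertaining outcome is the most likely."

AI is not going to kill us all, nor is it a fad. Instead, it will make us more productive and let us build more complex systems.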


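For that to happen, the best thing we can do now is speed up progress, and AI engineers will play a crucial role in that.

AI engineering is an interdisciplinary field that includes

Training and fine-tuning models.
Collecting, cleaning, and preprocessing data.
Building AI systems for complex task automation (AI agents, RAG, etc.).
Deploying models to production safely.
Continuously monitoring and evaluating AI systems.

I have been building and researching in this space a lot. So I have put together a list of open-source software that can make you a better AI engineer.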


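Here are the projects covered below. 👇

Composio: Build AI automation 10x faster. 🚀
Unsloth: Faster training and fine-tuning of AI models. 🦥💨
DSPy: A framework for programming LLMs. 🛠️
LLMWare: A framework for building enterprise RAG. 🏢
Taipy: Build AI web apps faster with Python. 🐍💻
LanceDB: A vector knowledge base for AI applications. 📚
Phidata: Build LLM agents with memory. 🧠
Phoenix: LLM observability made efficient. 🔥
Airbyte: Reliable and scalable data pipelines. 🌬️
AgentOps: Agent monitoring and observability. 👁️
RAGAS: A RAG evaluation framework. 📊
BentoML: The easiest way to serve AI applications and models. 🍱
LoRAX: A multi-LoRA inference server that scales to thousands of fine-tuned LLMs. 📡
Gateway: Reliably route to 200+ LLMs with a single API. 🌐
LitServe: A flexible, high-throughput serving engine for AI models. 💫

Feel free to explore these repositories and contribute to them.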

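Composio 👑: Build AI automation 10x faster 🚀

Tools and integrations form the core of building AI agents.

I have been building AI tools and agents, but tool accuracy was always a problem until I came across Composio.

Composio makes it easy to integrate popular applications such as GitHub, Slack, Jira, and Airtable with AI agents to build complex automation.

It handles user authentication and authorization for the integrations on behalf of your users, so you can build your AI applications with peace of mind. It is also SOC2 certified.

Here is how to get started with it.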

Python

pip install composio-core

Add a GitHub integration.

composio add github

Composio handles user authentication and authorization on your behalf.

Here is how you can use the GitHub integration to star a repository.

from openai import OpenAI
from composio_openai import ComposioToolSet, Action

openai_client = OpenAI(api_key="<OPENAI_API_KEY>")

# Initialise the Composio tool set
composio_toolset = ComposioToolSet(api_key="<COMPOSIO_API_KEY>")

# Get the pre-configured GitHub action for starring a repository
actions = composio_toolset.get_actions(actions=[Action.GITHUB_ACTIVITY_STAR_REPO_FOR_AUTHENTICATED_USER])

my_task = "Star a repo ComposioHQ/composio on GitHub"

# Create a chat completion request so the model can decide on the action
response = openai_client.chat.completions.create(
    model="gpt-4-turbo",
    tools=actions,  # Passing the actions we fetched earlier.
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": my_task},
    ],
)

# Execute the tool call(s) the model asked for
result = composio_toolset.handle_tool_calls(response)
print(result)

Run this Python script to have the agent carry out the given instruction.

JavaScript

You can install it using npm, yarn, or pnpm.

npm install composio-core

Define a method to let users connect their GitHub account.

import { OpenAI } from "openai";
import { OpenAIToolSet } from "composio-core";

const toolset = new OpenAIToolSet({
  apiKey: process.env.COMPOSIO_API_KEY,
});

async function setupUserConnectionIfNotExists(entityId) {
  const entity = await toolset.client.getEntity(entityId);
  const connection = await entity.getConnection('github');

  if (!connection) {
      // This entity/user hasn't connected the account yet, so start the login flow
      const newConnection = await entity.initiateConnection('github');
      console.log("Log in via: ", newConnection.redirectUrl);
      return newConnection.waitUntilActive(60);
  }

  return connection;
}

Add the required tools to the OpenAI SDK and pass the entity name to the executeAgent function.

async function executeAgent(entityName) {
  const entity = await toolset.client.getEntity(entityName)
  await setupUserConnectionIfNotExists(entity.id);

  const tools = await toolset.get_actions({ actions: ["github_activity_star_repo_for_authenticated_user"] }, entity.id);
  const instruction = "Star a repo ComposioHQ/composio on GitHub"

  const client = new OpenAI({ apiKey: process.env.OPEN_AI_API_KEY })
  const response = await client.chat.completions.create({
      model: "gpt-4-turbo",
      messages: [{
          role: "user",
          content: instruction,
      }],
      tools: tools,
      tool_choice: "auto",
  })

  console.log(response.choices[0].message.tool_calls);
  await toolset.handle_tool_call(response, entity.id);
}

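executeAgent("joey")

Run the code and let the agent do the work for you.

Composio works with popular frameworks such as LangChain, LlamaIndex, and CrewAI.

For more information, visit the official docs; for more complex examples, see the examples section of the repository.

https://dub.composio.dev/aigems Star the Composio repository ⭐

Unsloth: Faster training and fine-tuning of AI models 🦥💨

Training and fine-tuning large language models (LLMs) is a crucial part of AI engineering.

In many cases, proprietary models may not serve the purpose, whether because of cost, personalization, or privacy. At some point, you will need to fine-tune a model on a custom dataset. Right now, Unsloth is one of the best libraries for fine-tuning and training LLMs.

It supports full, LoRA, and QLoRA fine-tuning of popular LLMs, including Llama-3 and Mistral, and their derivatives such as Yi and Open-hermes.

To get started with Unsloth, install it with pip, and make sure you have torch 2.4 and CUDA 12.1.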

pip install --upgrade pip
pip install "unsloth[cu121-torch240] @ git+https://github.com/unslothai/unsloth.git"

Here is a simple script that trains a Mistral model on a dataset using SFT (supervised fine-tuning).

from unsloth import FastLanguageModel 
from unsloth import is_bfloat16_supported
import torch
from trl import SFTTrainer
from transformers import TrainingArguments
from datasets import load_dataset
max_seq_length = 2048 # Supports RoPE Scaling internally, so choose any!
# Get LAION dataset
url = "https://huggingface.co/datasets/laion/OIG/resolve/main/unified_chip2.jsonl"
dataset = load_dataset("json", data_files = {"train" : url}, split = "train")

# 4bit pre quantized models we support for 4x faster downloading + no OOMs.
fourbit_models = [
    "unsloth/mistral-7b-v0.3-bnb-4bit",      # New Mistral v3 2x faster!
] # More models at https://huggingface.co/unsloth

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = fourbit_models[0],  # the pre-quantized Mistral 7B checkpoint listed above
    max_seq_length = max_seq_length,
    dtype = None,
    load_in_4bit = True,
)

# Do model patching and add fast LoRA weights
model = FastLanguageModel.get_peft_model(
    model,
    r = 16,
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj",],
    lora_alpha = 16,
    lora_dropout = 0, # Supports any, but = 0 is optimized
    bias = "none",    # Supports any, but = "none" is optimized
    # [NEW] "unsloth" uses 30% less VRAM, fits 2x larger batch sizes!
    use_gradient_checkpointing = "unsloth", # True or "unsloth" for very long context
    random_state = 3407,
    max_seq_length = max_seq_length,
    use_rslora = False,  # We support rank stabilized LoRA
    loftq_config = None, # And LoftQ
)

trainer = SFTTrainer(
    model = model,
    train_dataset = dataset,
    dataset_text_field = "text",
    max_seq_length = max_seq_length,
    tokenizer = tokenizer,
    args = TrainingArguments(
        per_device_train_batch_size = 2,
        gradient_accumulation_steps = 4,
        warmup_steps = 10,
        max_steps = 60,
        fp16 = not is_bfloat16_supported(),
        bf16 = is_bfloat16_supported(),
        logging_steps = 1,
        output_dir = "outputs",
        optim = "adamw_8bit",
        seed = 3407,
    ),
)
trainer.train()
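
To see why the LoRA setup in the script above is cheap to train, it helps to count parameters. The sketch below is a back-of-the-envelope calculation and not part of Unsloth: it assumes a single hypothetical 4096 x 4096 projection matrix, the rank `r = 16` used in the script, and the standard LoRA formulation in which a frozen weight receives a trainable low-rank update made of two small factors.

```python
def lora_trainable_params(d_in: int, d_out: int, r: int) -> int:
    """Parameters in the two LoRA factors: A is (r x d_in), B is (d_out x r)."""
    return r * d_in + d_out * r


# Hypothetical layer size, chosen only for illustration.
d_in, d_out, rank = 4096, 4096, 16

full_params = d_in * d_out                      # weights updated by full fine-tuning
lora_params = lora_trainable_params(d_in, d_out, rank)
fraction = lora_params / full_params

print(f"full fine-tuning: {full_params:,} weights")
print(f"LoRA (r={rank}): {lora_params:,} weights ({fraction:.2%} of full)")
```

Under these assumptions the adapter trains well under 1% of the layer's weights, which is the main reason LoRA and QLoRA fit on much smaller GPUs than full fine-tuning.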

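For more information, see the official documentation.

https://github.com/unslothai/unsloth Star the Unsloth repository ⭐

DSPy: A framework for programming LLMs 🛠️

One of the factors holding LLMs back in production use cases is their stochastic nature. For such use cases, prompting them to output the desired response has a high failure rate.

DSPy tackles this problem. Instead of prompting, it lets you program LLMs for maximum reliability.

DSPy simplifies this by doing two key things: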

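Separating program flow from parameters: DSPy keeps the flow of your program (the steps you take) separate from the details of how each step is carried out (the LM prompts and weights). This makes your system easier to manage and update.
Introducing new optimizers: DSPy uses advanced algorithms that automatically tune the LM prompts and weights toward your goal, such as improving accuracy or reducing errors.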
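
The two ideas above can be illustrated without any library. The following is a toy sketch and not DSPy's API: `stub_lm`, `program`, and `optimize` are hypothetical names, the "language model" is a hard-coded stub, and the "optimizer" simply keeps whichever prompt template scores best on a tiny training set.

```python
FACTS = {"France": "Paris", "Japan": "Tokyo"}


def stub_lm(prompt: str) -> str:
    """Stand-in for a language model: terse only when told to answer in one word."""
    for country, capital in FACTS.items():
        if country in prompt:
            if "one word" in prompt:
                return capital
            return f"The capital of {country} is {capital}."
    return "unknown"


def program(question: str, template: str, lm) -> str:
    """Program flow: build the prompt, call the LM, post-process the output."""
    return lm(template.format(question=question)).strip()


def optimize(templates, trainset, lm) -> str:
    """Parameter search: return the template with the best exact-match accuracy."""
    def accuracy(template: str) -> float:
        hits = sum(program(q, template, lm) == gold for q, gold in trainset)
        return hits / len(trainset)

    return max(templates, key=accuracy)


templates = [
    "Question: {question}",
    "Answer with one word. Question: {question}",
]
trainset = [
    ("What is the capital of France?", "Paris"),
    ("What is the capital of Japan?", "Tokyo"),
]

best = optimize(templates, trainset, stub_lm)
print(best)
print(program("What is the capital of Japan?", best, stub_lm))
```

The flow in `program` never changes; only the template (the parameter) is swapped by the search. Real optimizers explore a much larger space, but the separation is the same.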

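Check out this intro notebook to learn more about how to use DSPy.

https://github.com/stanfordnlp/dspy Star the DSPy repository ⭐

LLMWare: A framework for building enterprise RAG 🏢

When developing enterprise software, privacy, security, and reliability are paramount. If you are looking for a framework for building enterprise AI applications, LLMWare is a strong choice.

It provides a unified framework for building LLM-based applications (e.g., RAG, agents) using small, specialized models that can be deployed privately, integrated safely and securely with enterprise knowledge sources, and cost-effectively tuned and adapted for any business process.

LLMWare has two main components:

RAG Pipeline - integrated components for connecting knowledge sources to generative AI models.
50+ small, specialized models fine-tuned for key tasks in enterprise process automation, including fact-based question answering, classification, summarization, and extraction.

Set up LLMWare with pip.

pip3 install llmware

Here is how to create and use datasets.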


""" This example demonstrates creating and using datasets
    1. Datasets suitable for fine-tuning embedding models
    2. Completion and other types of datasets
    3. Generating datasets from all data in a library or with filtered data
    4. Creating datasets from AWS Transcribe transcripts
"""

import json
import os
from llmware.dataset_tools import Datasets
from llmware.library import Library
from llmware.retrieval import Query
from llmware.setup import Setup
from llmware.configs import LLMWareConfig

def build_and_use_dataset(library_name):

    # Setup a library and build a knowledge graph.  Datasets will use the data in the knowledge graph
    print (f"\n > Creating library {library_name}...")
    library = Library().create_new_library(library_name)
    sample_files_path = Setup().load_sample_files()
    library.add_files(os.path.join(sample_files_path,"SmallLibrary"))
    library.generate_knowledge_graph()

    # Create a Datasets object from library
    datasets = Datasets(library)

    # Build a basic dataset useful for industry domain adaptation for fine-tuning embedding models
    print (f"\n > Building basic text dataset...")

    basic_embedding_dataset = datasets.build_text_ds(min_tokens=500, max_tokens=1000)
    dataset_location = os.path.join(library.dataset_path, basic_embedding_dataset["ds_id"])

    print (f"\n > Dataset:")
    print (f"(Files referenced below are found in {dataset_location})")

    print (f"\n{json.dumps(basic_embedding_dataset, indent=2)}")
    sample = datasets.get_dataset_sample(datasets.current_ds_name)

    print (f"\nRandom sample from the dataset:\n{json.dumps(sample, indent=2)}")

    # Other Dataset Generation and Usage Examples:

    # Build a simple self-supervised generative dataset- extracts text and splits into 'text' & 'completion'
    # Several generative "prompt_wrappers" are available - chat_gpt | alpaca | 
    basic_generative_completion_dataset = datasets.build_gen_ds_targeted_text_completion(prompt_wrapper="alpaca")

    # Build a generative self-supervised training set by pairing 'header_text' with 'text.'
    xsum_generative_completion_dataset = datasets.build_gen_ds_headline_text_xsum(prompt_wrapper="human_bot")
    topic_prompter_dataset = datasets.build_gen_ds_headline_topic_prompter(prompt_wrapper="chat_gpt")

    # Filter a library by a key term as part of building the dataset
    filtered_dataset = datasets.build_text_ds(query="agreement", filter_dict={"master_index":1})

    # Pass a set of query results to create a dataset from those results only
    query_results = Query(library=library).query("africa")
    query_filtered_dataset = datasets.build_text_ds(min_tokens=250,max_tokens=600, qr=query_results)

    return 0

if __name__ == "__main__":

    LLMWareConfig().set_active_db("sqlite")

    build_and_use_dataset("test_txt_datasets_0")

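Explore the examples to see how to use LLMWare. For more information, refer to the documentation.

https://github.com/llmware-ai/llmware Star the LLMWare repository ⭐

Taipy: Build AI web apps faster with Python 🐍💻

Taipy is open-source, Python-based software designed for building AI web applications for production environments. It complements Streamlit and Gradio by enabling Python developers to take demo applications into production.

Taipy is designed for data scientists and machine learning engineers who build data and AI web applications.

It lets you build production-ready web applications.
There is no new language to learn; Python is all you need.
You can focus on data and AI algorithms without development and deployment complexity.

Get started quickly with pip.

pip install taipy

This simple Taipy application shows how to build a basic film recommendation system with Taipy.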

import taipy as tp
import pandas as pd
from taipy import Config, Scope, Gui

# Defining the helper functions

# Callback definition - submits scenario with genre selection
def on_genre_selected(state):
    scenario.selected_genre_node.write(state.selected_genre)
    tp.submit(scenario)
    state.df = scenario.filtered_data.read()

## Set initial value to Action
def on_init(state):
    on_genre_selected(state)

# Filtering function - task
def filter_genre(initial_dataset: pd.DataFrame, selected_genre):
    filtered_dataset = initial_dataset[initial_dataset["genres"].str.contains(selected_genre)]
    filtered_data = filtered_dataset.nlargest(7, "Popularity %")
    return filtered_data

# The main script
if __name__ == "__main__":
    # Taipy Scenario & Data Management

    # Load the configuration made with Taipy Studio
    Config.load("config.toml")
    scenario_cfg = Config.scenarios["scenario"]

    # Start Taipy Core service
    tp.Core().run()

    # Create a scenario
    scenario = tp.create_scenario(scenario_cfg)

    # Taipy User Interface
    # Let's add a GUI to our Scenario Management for a complete application

    # Get list of genres
    genres = [
        "Action", "Adventure", "Animation", "Children", "Comedy", "Fantasy", "IMAX",
        "Romance", "Sci-FI", "Western", "Crime", "Mystery", "Drama", "Horror", "Thriller",
        "Film-Noir", "War", "Musical", "Documentary",
    ]

    # Initialization of variables
    df = pd.DataFrame(columns=["Title", "Popularity %"])
    selected_genre = "Action"

    # User interface definition
    my_page = """
# Film recommendation

## Choose your favorite genre
<|{selected_genre}|selector|lov={genres}|on_change=on_genre_selected|dropdown|>

## Here are the top seven picks by popularity
<|{df}|chart|x=Title|y=Popularity %|type=bar|title=Film Popularity|>
    """

    Gui(page=my_page).run()

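Check out the documentation to learn more.

https://github.com/avaiga/taipy Star the Taipy repository ⭐

LanceDB: A vector knowledge base for AI applications 📚

If you are building AI applications, you will need a vector database to store and retrieve unstructured data such as text, images, and videos. Unlike traditional databases, vector databases store embeddings of this data.

Embeddings are high-dimensional numerical representations of data. Vector databases retrieve relevant data using methods such as similarity scores.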
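
To make similarity-based retrieval concrete, here is a minimal sketch of a brute-force vector search using cosine similarity. It is not how LanceDB is implemented: the three-dimensional "embeddings" are made-up values, and `cosine_similarity` and `top_k` are hypothetical helper names. Real embeddings have hundreds of dimensions, and real vector databases use indexes rather than scanning every row.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query, store, k=2):
    """Return the k (item, score) pairs whose vectors are most similar to the query."""
    scored = [(item, cosine_similarity(query, vec)) for item, vec in store.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Made-up embeddings, for illustration only.
store = {
    "cat": [0.9, 0.1, 0.0],
    "kitten": [0.8, 0.2, 0.1],
    "car": [0.0, 0.1, 0.9],
}

results = top_k([1.0, 0.0, 0.0], store, k=2)
print([item for item, _ in results])
```

The query vector points in nearly the same direction as "cat" and "kitten", so those two rank above "car" even though no text is compared directly.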

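LanceDB is an open-source vector database with a Rust core and client libraries for languages including TypeScript and Python. It offers production-scale vector search, multimodal support, zero-copy access, automatic data versioning, GPU-powered queries, and more.

Get started with LanceDB.

npm install @lancedb/lancedb

Create and query a vector database.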

import * as lancedb from "@lancedb/lancedb";

const db = await lancedb.connect("data/sample-lancedb");
const table = await db.createTable("vectors", [
    { id: 1, vector: [0.1, 0.2], item: "foo", price: 10 },
    { id: 2, vector: [1.1, 1.2], item: "bar", price: 50 },
], {mode: 'overwrite'});

const query = table.vectorSearch([0.1, 0.3]).limit(2);
const results = await query.toArray();

// You can also search for rows by specific criteria without involving a vector search.
const rowsByCriteria = await table.query().where("price >= 10").toArray();

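You can find more about LanceDB in its documentation.

https://github.com/lancedb/lancedb Star the LanceDB repository ⭐

Phidata: Build LLM agents with memory 🧠

Building effective agents is often not as easy as it sounds. Managing memory, caching, and tool execution can become challenging.

Phidata is an open-source framework that offers a convenient and reliable way to build agents with long-term memory, contextual knowledge, and the ability to take actions using function calling.

Get started with Phidata by installing it with pip.

pip install -U phidata

Let's create a simple assistant that can query financial data.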

from phi.assistant import Assistant
from phi.llm.openai import OpenAIChat
from phi.tools.yfinance import YFinanceTools

assistant = Assistant(
    llm=OpenAIChat(model="gpt-4o"),
    tools=[YFinanceTools(stock_price=True, analyst_recommendations=True, company_info=True, company_news=True)],
    show_tool_calls=True,
    markdown=True,
)
assistant.print_response("What is the stock price of NVDA")
assistant.print_response("Write a comparison between NVDA and AMD, use all tools available.")

An assistant that can search the web.

from phi.assistant import Assistant
from phi.tools.duckduckgo import DuckDuckGo

assistant = Assistant(tools=[DuckDuckGo()], show_tool_calls=True)
assistant.print_response("Whats happening in France?", markdown=True)

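See the official documentation for examples and more information.

https://github.com/phidatahq/phidata Star the Phidata repository ⭐

Phoenix: LLM observability made efficient 🔥

An AI application is only complete once you add an observability layer. LLM applications typically have many moving parts, such as prompts, model temperature, and top-p values, and even a slight change can significantly affect the results.

This can make applications highly unstable and unreliable. This is where LLM observability comes in. Phoenix by ArizeAI makes it easy to trace the entire execution of an LLM application.

It is an open-source AI observability platform designed for experimentation, evaluation, and troubleshooting. It provides:

Tracing - Trace your LLM application's runtime using OpenTelemetry-based instrumentation.
Evaluation - Use LLMs to benchmark your application's performance with response and retrieval evals.
Datasets - Create versioned datasets of examples for experimentation, evaluation, and fine-tuning.
Experiments - Track and evaluate changes to prompts, LLMs, and retrieval.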
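
To give a feel for what a trace contains, here is a minimal sketch of nested spans recorded with a context manager. It is neither Phoenix nor OpenTelemetry code: the `Tracer` class, the span names, and the attribute values are all hypothetical, and a real tracer would export spans to a collector instead of keeping them in a list.

```python
import time
from contextlib import contextmanager


class Tracer:
    """Toy tracer: records each span's name, parent, attributes, and duration."""

    def __init__(self):
        self.spans = []    # finished spans, in completion order
        self._stack = []   # names of the spans that are currently open

    @contextmanager
    def span(self, name, **attributes):
        record = {
            "name": name,
            "parent": self._stack[-1] if self._stack else None,
            "attributes": attributes,
        }
        self._stack.append(name)
        start = time.perf_counter()
        try:
            yield record
        finally:
            record["seconds"] = time.perf_counter() - start
            self._stack.pop()
            self.spans.append(record)


tracer = Tracer()
with tracer.span("rag_query", question="How can I deploy Arize?"):
    with tracer.span("retrieve", top_k=2):
        pass  # the retrieval step would run here
    with tracer.span("llm_call", model="hypothetical-model", temperature=0.2):
        pass  # the model call would run here

print([(s["name"], s["parent"]) for s in tracer.spans])
```

Because every span stores its parent and its settings, a change in temperature or retrieval depth can be tied to the exact step where the output changed.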

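Phoenix is vendor- and language-agnostic, with support for frameworks such as LlamaIndex, LangChain, and DSPy, and LLM providers such as OpenAI and Bedrock.

It can run in a variety of environments, including Jupyter notebooks, local machines, containers, or the cloud.

Getting started with Phoenix is easy.

pip install arize-phoenix

First, launch the Phoenix app.

import phoenix as px
session = px.launch_app()

This starts the Phoenix server.

You can now set up tracing for your AI application and debug it as the traces stream in.

To use LlamaIndex's one-click integration, you must first install the small integration packages (the script below also reads its sample index from Google Cloud Storage through gcsfs):

pip install 'llama-index>=0.10.44' openinference-instrumentation-llama-index gcsfs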

import phoenix as px
from openinference.instrumentation.llama_index import LlamaIndexInstrumentor
import os
from gcsfs import GCSFileSystem
from llama_index.core import (
    Settings,
    VectorStoreIndex,
    StorageContext,
    set_global_handler,
    load_index_from_storage
)
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.llms.openai import OpenAI
import llama_index

# To view traces in Phoenix, you will first have to start a Phoenix server. You can do this by running the following:
session = px.launch_app()

# Initialize LlamaIndex auto-instrumentation
LlamaIndexInstrumentor().instrument()

os.environ["OPENAI_API_KEY"] = "<ENTER_YOUR_OPENAI_API_KEY_HERE>"

# LlamaIndex application initialization may vary
# depending on your application
Settings.llm = OpenAI(model="gpt-4-turbo-preview")
Settings.embed_model = OpenAIEmbedding(model="text-embedding-ada-002")

# Load your data and create an index. Here we've provided an example of our documentation
file_system = GCSFileSystem(project="public-assets-275721")
index_path = "arize-phoenix-assets/datasets/unstructured/llm/llama-index/arize-docs/index/"
storage_context = StorageContext.from_defaults(
    fs=file_system,
    persist_dir=index_path,
)

index = load_index_from_storage(storage_context)

query_engine = index.as_query_engine()

# Query your LlamaIndex application
query_engine.query("What is the meaning of life?")
query_engine.query("How can I deploy Arize?")

# View the traces in the Phoenix UI
px.active_session().url

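Once you have run a sufficient number of queries (or chats) through your application, you can view the details in the UI by refreshing the browser URL.

See their documentation for more examples of tracing, dataset versioning, and evaluation.

https://github.com/Arize-ai/phoenix Star the Phoenix repository ⭐

Airbyte: Reliable and scalable data pipelines 🌬️

Data is essential for building AI applications, especially in production, where you have to manage large volumes of data from different sources. This is where Airbyte excels.

Airbyte offers an extensive catalog of 300+ connectors for APIs, databases, data warehouses, and data lakes.

Airbyte also has a Python library called PyAirbyte. It supports popular frameworks such as LangChain and LlamaIndex, so you can easily move data from multiple sources into your GenAI applications.

Check out this notebook for details on using PyAirbyte with LangChain.

For more information, check out the documentation.

https://github.com/airbytehq/airbyte Star the Airbyte repository ⭐

AgentOps: Agent monitoring and observability 👁️

Just like traditional software systems, AI agents need continuous monitoring and observation. This is important to ensure that an agent's behavior does not deviate from what is expected.

AgentOps provides a comprehensive solution for monitoring and observing AI agents.

It offers tools for replay analytics, LLM cost management, agent benchmarking, compliance, and security, and it integrates natively with frameworks such as CrewAI, AutoGen, and LangChain.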
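
The core mechanism behind this kind of monitoring is recording what each agent action did and how long it took. The sketch below shows that idea with a plain decorator. It is not the AgentOps SDK: `track_action`, `EVENTS`, and `lookup_weather` are hypothetical names, and events are kept in an in-memory list rather than being sent to a dashboard.

```python
import functools
import time

EVENTS = []  # in-memory event log; a real tool would ship these to a backend


def track_action(name):
    """Decorator that records the duration and outcome of every call."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            status = "error"  # overwritten only if the call returns normally
            try:
                result = func(*args, **kwargs)
                status = "ok"
                return result
            finally:
                EVENTS.append({
                    "action": name,
                    "status": status,
                    "seconds": time.perf_counter() - start,
                })
        return wrapper
    return decorator


@track_action("lookup_weather")
def lookup_weather(city):
    # Hypothetical tool; a real agent would call an external API here.
    return f"Sunny in {city}"


lookup_weather("Taipei")
print(EVENTS[0]["action"], EVENTS[0]["status"])
```

Because the event is appended in a `finally` block, failed calls are logged too, which is usually what you want when investigating why an agent went off track.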

Get started with AgentOps by installing it with pip.

pip install agentops

Initialize the AgentOps client to automatically get analytics on every LLM call.

import agentops

# Beginning of program's code (i.e. main.py, __init__.py)
agentops.init("<INSERT YOUR API KEY HERE>")

# ... your agent code goes here ...

# (optional: record specific functions)
@agentops.record_action('sample function being recorded')
def sample_function():
    ...

# End of program
agentops.end_session('Success')
# Woohoo You're done 🎉

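For more information, refer to their documentation.

https://github.com/AgentOps-AI/agentops Star the AgentOps repository ⭐

RAGAS: A RAG evaluation framework 📊

Building a RAG pipeline is challenging, but determining how effective it is in real-world scenarios is another matter. Despite advances in RAG application frameworks, ensuring reliability for real users remains difficult, especially when the cost of an incorrect retrieval is high.

RAGAS is a framework designed to address this problem. It helps you evaluate your Retrieval Augmented Generation (RAG) pipelines.

It helps you generate synthetic test sets, test your RAG pipelines against them, and monitor your RAG applications in production.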
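
As a rough illustration of what evaluating retrieval means, the sketch below scores a retriever on a tiny hand-labelled test set using plain ID overlap. This is a deliberately simplified stand-in and not how RAGAS computes its metrics: `id_precision`, `id_recall`, and the document IDs are hypothetical.

```python
def id_precision(retrieved, relevant):
    """Share of retrieved documents that are actually relevant."""
    if not retrieved:
        return 0.0
    return sum(1 for doc in retrieved if doc in relevant) / len(retrieved)


def id_recall(retrieved, relevant):
    """Share of relevant documents that the retriever found."""
    if not relevant:
        return 0.0
    return sum(1 for doc in relevant if doc in retrieved) / len(relevant)


# Hypothetical test set: what the retriever returned vs. what a human marked relevant.
test_set = [
    {"retrieved": ["d1", "d2", "d3"], "relevant": {"d1", "d4"}},
    {"retrieved": ["d5"], "relevant": {"d5"}},
]

mean_precision = sum(id_precision(c["retrieved"], c["relevant"]) for c in test_set) / len(test_set)
mean_recall = sum(id_recall(c["retrieved"], c["relevant"]) for c in test_set) / len(test_set)

print(f"precision={mean_precision:.3f} recall={mean_recall:.3f}")
```

Low precision means the generator is fed irrelevant context; low recall means the answer may be missing from the context altogether. Tracking both over a fixed test set shows whether a pipeline change helped.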

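Check out the documentation to learn how to use RAGAS to improve new and existing RAG pipelines.

https://github.com/explodinggradients/ragas Star the RAGAS repository ⭐

BentoML: The easiest way to serve AI applications and models 🍱

BentoML is open-source software that offers a convenient way to serve models and AI applications in production. Whether it is a traditional machine learning model or a language model, it can turn any model inference script into a REST API server with just a few lines of code and standard Python type hints.

It offers model-serving optimizations such as dynamic batching, model parallelism, multi-stage pipelines, and multi-model inference-graph orchestration.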
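
Dynamic batching is the least obvious item in that list, so here is a minimal sketch of the idea: requests that arrive close together are grouped and sent to the model in one call. This is not BentoML's scheduler. `form_batches`, the arrival times, and the limits are hypothetical, and the function works on a recorded list of arrivals instead of a live queue.

```python
def form_batches(arrivals, max_batch_size, max_wait_ms):
    """Group (arrival_ms, payload) pairs into batches.

    A batch is closed when it is full, or when the next request arrives
    more than max_wait_ms after the first request in the batch.
    """
    batches = []
    current = []
    opened_at = None
    for arrival_ms, payload in sorted(arrivals, key=lambda pair: pair[0]):
        is_full = len(current) == max_batch_size
        waited_too_long = bool(current) and arrival_ms - opened_at > max_wait_ms
        if is_full or waited_too_long:
            batches.append(current)
            current = []
        if not current:
            opened_at = arrival_ms  # this request opens a new batch
        current.append(payload)
    if current:
        batches.append(current)
    return batches


# Hypothetical request stream: three requests in a burst, then two more later.
arrivals = [(0, "a"), (2, "b"), (3, "c"), (20, "d"), (21, "e")]
batches = form_batches(arrivals, max_batch_size=2, max_wait_ms=10)
print(batches)
```

The two limits trade latency for throughput: a larger batch size uses the accelerator better, while a shorter wait keeps individual requests from stalling behind a half-empty batch.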

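BentoML lets you quickly implement your own APIs or task queues with custom business logic, model inference, and multi-model composition.

First, install the BentoML library.

# Requires Python≥3.8
pip install -U bentoml

Define the API in a service.py file.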

from __future__ import annotations

import bentoml

@bentoml.service(
    resources={"cpu": "4"}
)
class Summarization:
    def __init__(self) -> None:
        import torch
        from transformers import pipeline

        device = "cuda" if torch.cuda.is_available() else "cpu"
        self.pipeline = pipeline('summarization', device=device)

    @bentoml.api(batchable=True)
    def summarize(self, texts: list[str]) -> list[str]:
        results = self.pipeline(texts)
        return [item['summary_text'] for item in results]

Run the service code locally (it is served at http://localhost:3000 by default):

pip install torch transformers  # additional dependencies for local run

bentoml serve service.py:Summarization

You can now run inference from your browser at http://localhost:3000 or with a Python script:

import bentoml

with bentoml.SyncHTTPClient('http://localhost:3000') as client:
    summarized_text: str = client.summarize([bentoml.__doc__])[0]
    print(f"Result: {summarized_text}")

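Explore the documentation to learn more.

https://github.com/bentoml/BentoML Star the BentoML repository ⭐

LoRAX: A multi-LoRA inference server that scales to thousands of fine-tuned LLMs 📡

LoRAX (LoRA eXchange) is a framework that lets users serve thousands of fine-tuned models on a single GPU, dramatically reducing the cost of serving without compromising throughput or latency.

It dynamically loads LoRA adapters just in time from HuggingFace, Predibase, or any filesystem, without blocking concurrent requests.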
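
One way to picture just-in-time adapter loading is a small cache that keeps only the most recently used adapters in memory and loads the rest on demand. The sketch below is a generic least-recently-used cache, offered as an assumption about how such a scheme could work and not as LoRAX's actual policy. `AdapterCache` and the loader function are hypothetical.

```python
from collections import OrderedDict


class AdapterCache:
    """Keep at most `capacity` adapters loaded; evict the least recently used."""

    def __init__(self, capacity, loader):
        self.capacity = capacity
        self.loader = loader          # function: adapter_id -> adapter weights
        self._loaded = OrderedDict()  # ordered from least to most recently used
        self.load_count = 0

    def get(self, adapter_id):
        if adapter_id in self._loaded:
            self._loaded.move_to_end(adapter_id)  # mark as most recently used
            return self._loaded[adapter_id]
        weights = self.loader(adapter_id)         # the slow path: fetch and load
        self.load_count += 1
        self._loaded[adapter_id] = weights
        if len(self._loaded) > self.capacity:
            self._loaded.popitem(last=False)      # drop the least recently used
        return weights

    def loaded_ids(self):
        return list(self._loaded)


# Hypothetical loader that fabricates a placeholder instead of reading real weights.
cache = AdapterCache(capacity=2, loader=lambda adapter_id: f"weights-for-{adapter_id}")
for adapter_id in ["a", "b", "a", "c"]:
    cache.get(adapter_id)

print(cache.loaded_ids(), cache.load_count)
```

Since an adapter is far smaller than its base model, many adapters can share one copy of the base weights, and only the small per-request adapter needs to be swapped in.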

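LoRAX uses advanced quantization and optimization techniques, such as paged attention, flash attention, tensor parallelism, and token streaming, to deliver high throughput at low latency.

To get started, you need a Linux OS, CUDA 11.8-compatible device drivers, and an Nvidia GPU of the Ampere generation or newer.

Launch the LoRAX server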

model=mistralai/Mistral-7B-Instruct-v0.1
volume=$PWD/data

docker run --gpus all --shm-size 1g -p 8080:80 -v $volume:/data \
    ghcr.io/predibase/lorax:main --model-id $model

Prompt the base LLM:

curl 127.0.0.1:8080/generate \
    -X POST \
    -d '{
        "inputs": "[INST] Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May? [/INST]",
        "parameters": {
            "max_new_tokens": 64
        }
    }' \
    -H 'Content-Type: application/json'

Prompt a LoRA adapter:

curl 127.0.0.1:8080/generate \
    -X POST \
    -d '{
        "inputs": "[INST] Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May? [/INST]",
        "parameters": {
            "max_new_tokens": 64,
            "adapter_id": "vineetsharma/qlora-adapter-Mistral-7B-Instruct-v0.1-gsm8k"
        }
    }' \
    -H 'Content-Type: application/json'

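For full details, see Reference - REST API.

https://github.com/predibase/lorax Star the LoRAX repository ⭐

Gateway: Reliably route to 200+ LLMs with a single API 🌐

When building AI applications, we often depend on proprietary LLMs or on LLMs hosted by a cloud provider. It is best to be prepared for outages, because you never know what will happen.

In these situations, you should reroute requests from one provider to another. Portkey's Gateway is built for exactly this.

It offers a unified API for 200+ LLM providers. It supports caching, load balancing, routing, and retries, and it can be edge-deployed for minimal latency.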
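
The retry-then-fallback behaviour described above can be sketched in a few lines. This is a simplified illustration, not Gateway's implementation or API: `call_with_fallback`, `ProviderError`, and the two fake providers are hypothetical, and a real gateway would also handle timeouts, rate limits, and load balancing.

```python
import time


class ProviderError(Exception):
    """Raised when a provider call fails."""


def call_with_fallback(providers, prompt, retries=2, backoff_s=0.0):
    """Try each provider in order, retrying each one before moving to the next."""
    errors = []
    for name, call in providers:
        for attempt in range(1, retries + 1):
            try:
                return name, call(prompt)
            except ProviderError as exc:
                errors.append(f"{name} attempt {attempt}: {exc}")
                time.sleep(backoff_s * attempt)  # linear backoff between attempts
    raise ProviderError("all providers failed: " + "; ".join(errors))


def flaky_primary(prompt):
    # Hypothetical provider that is currently down.
    raise ProviderError("503 service unavailable")


def healthy_backup(prompt):
    # Hypothetical provider that echoes the prompt back.
    return f"echo: {prompt}"


used, reply = call_with_fallback(
    [("primary", flaky_primary), ("backup", healthy_backup)],
    "What's a fractal?",
)
print(used, reply)
```

The caller only sees one function with one signature, which is the point of a unified API: provider outages are absorbed inside the routing layer.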

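This makes it an important building block for fault-tolerant, robust AI systems. It is available for Python, Go, Rust, Java, Ruby, and JavaScript.

Get started with Gateway by installing it.

pip install -qU portkey-ai openai

For OpenAI models,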

from openai import OpenAI
from portkey_ai import PORTKEY_GATEWAY_URL, createHeaders

client = OpenAI(
    api_key=OPENAI_API_KEY,
    base_url=PORTKEY_GATEWAY_URL,
    default_headers=createHeaders(
        provider="openai",
        api_key=PORTKEY_API_KEY
    )
)

chat_complete = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user",
               "content": "What's a fractal?"}],
)

print(chat_complete.choices[0].message.content)

For Anthropic models,

from openai import OpenAI
from portkey_ai import PORTKEY_GATEWAY_URL, createHeaders

client = OpenAI(
    api_key=ANTHROPIC_API_KEY,  # placeholder: set this to your Anthropic API key
    base_url=PORTKEY_GATEWAY_URL,
    default_headers=createHeaders(
        provider="anthropic",
        api_key=PORTKEY_API_KEY
    ),
)

response = client.chat.completions.create(
    model="claude-3-opus-20240229",
    messages=[{"role": "user",
               "content": "What's a fractal?"}],
    max_tokens=512,
)

print(response.choices[0].message.content)

For more information, visit the official repository.

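https://github.com/Portkey-AI/gateway Star the Gateway repository ⭐

LitServe: A flexible, high-throughput serving engine for AI models 💫

LitServe is another serving engine for AI models. It is highly optimized for parallel execution and has native features for scaling AI workloads. According to the project's benchmarks, LitServe (built on FastAPI) handles more concurrent requests than FastAPI and TorchServe.

LitServe can be self-hosted on your own machines, which makes it a good fit for hackers, students, and developers who prefer a DIY approach.

Install LitServe via pip:

pip install litserve

Define a server

Here is a Hello World example: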

# server.py
import litserve as ls

# STEP 1: DEFINE A MODEL API
class SimpleLitAPI(ls.LitAPI):
    # Called once at startup. Setup models, DB connections, etc...
    def setup(self, device):
        self.model = lambda x: x**2

    # Convert the request payload to model input.
    def decode_request(self, request):
        return request["input"]

    # Run inference on the model, and return the output.
    def predict(self, x):
        return self.model(x)

    # Convert the model output to a response payload.
    def encode_response(self, output):
        return {"output": output}

# STEP 2: START THE SERVER
if __name__ == "__main__":
    api = SimpleLitAPI()
    server = ls.LitServer(api, accelerator="auto")
    server.run(port=8000)

Now run the server from the command line.