Goliath - Developer Guide

Welcome to the Meetrix Goliath Developer Guide! This guide is designed to assist you in seamlessly integrating Goliath into your AWS environment through detailed, step-by-step instructions.

Goliath 120B is a potent language model created by combining two fine-tuned Llama 70B models, boasting 4-bit operation, 4k context, and outperforming GPT-4. With enhanced fp16 performance, it defies RoPE scaling limits and is ready to deploy instantly, featuring API integration and OpenAI compatibility for versatile application.

Prerequisites

Before you get started with the Goliath AMI, ensure you have the following prerequisites:

Launching the AMI

Step 1: Find and Select 'Goliath' AMI

  1. Log in to your AWS Management Console.
  2. Navigate to the 'Goliath' listing in AWS Marketplace.

Step 2: Initial Setup & Configuration

  1. Click the "Continue to Subscribe" button.
  2. After subscribing, you will need to accept the terms and conditions. Click on "Accept Terms" to proceed.
  3. Please wait for a few minutes while the processing takes place. Once it's completed, click on "Continue to Configuration".
  4. Select the "CloudFormation Template" as the fulfilment option and choose your preferred region on the "Configure this software" page. Afterward, click the "Continue to Launch" button.
  5. From the "Choose Action" dropdown menu on the "Launch this software" page, select "Launch CloudFormation" and click the "Launch" button.

Create CloudFormation Stack

Step 1: Create stack

  1. Ensure the "Template is ready" radio button is selected under "Prepare template".
  2. Click "Next".

Step 2: Specify stack options

  1. Provide a unique "Stack name".
  2. Provide the "Admin Email" for SSL generation.
  3. For "DeploymentName", enter a name of your choice.
  4. Provide a public domain name for "DomainName". (Goliath automatically tries to set up SSL for the provided domain name if that domain is hosted on Route 53, so make sure your domain is hosted there. If the automatic setup is unsuccessful, you have to set up SSL manually.)
  5. Choose an instance type for "InstanceType" (recommended: g4dn.metal).
  6. Select your preferred "keyName".
  7. Set "SSHLocation" as "0.0.0.0/0".
  8. Keep "SubnetCidrBlock" as "10.0.0.0/24".
  9. Keep "VpcCidrBlock" as "10.0.0.0/16".
  10. Click "Next".
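
The default "SubnetCidrBlock" and "VpcCidrBlock" values work together because the subnet range sits inside the VPC range. If you change either value, a quick containment check may catch a mismatch before stack creation fails. The following is a minimal sketch only; the helper names ipToInt, parseCidr, and cidrContains are hypothetical and are not part of the Goliath template.

```javascript
// Convert a dotted IPv4 address to an unsigned 32-bit integer.
function ipToInt(ip) {
  return ip
    .split('.')
    .reduce((acc, octet) => ((acc << 8) | Number(octet)) >>> 0, 0);
}

// Split "a.b.c.d/n" into its network address, prefix length, and mask.
function parseCidr(cidr) {
  const [ip, prefixText] = cidr.split('/');
  const prefix = Number(prefixText);
  // A /0 prefix is handled separately because shifting a 32-bit value
  // by 32 bits is a no-op in JavaScript.
  const mask = prefix === 0 ? 0 : (0xffffffff << (32 - prefix)) >>> 0;
  return { network: (ipToInt(ip) & mask) >>> 0, prefix, mask };
}

// True when every address in `inner` also belongs to `outer`.
function cidrContains(outer, inner) {
  const a = parseCidr(outer);
  const b = parseCidr(inner);
  return b.prefix >= a.prefix && ((b.network & a.mask) >>> 0) === a.network;
}

console.log(cidrContains('10.0.0.0/16', '10.0.0.0/24')); // true
console.log(cidrContains('10.0.0.0/16', '10.1.0.0/24')); // false
```

The first call uses the default values from the steps above; the second shows a subnet that falls outside the VPC range.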

Step 3: Configure stack options

  1. Choose "Roll back all stack resources" and "Delete all newly created resources" under the "Stack failure options" section.
  2. Click "Next".

Step 4: Review

  1. Review and verify the details you've entered.
  2. Tick the box that says, "I acknowledge that AWS CloudFormation might create IAM resources with custom names".
  3. Click "Submit".

Afterward, you'll be directed to the CloudFormation stacks page.

Please wait for 5-10 minutes until the stack has been successfully created.

Update DNS

Step 1: Copy IP Address

  1. Copy the public IP labeled "PublicIp" in the "Outputs" tab.

Step 2: Update DNS

  1. Go to AWS Route 53 and navigate to "Hosted Zones".
  2. From there, select the domain you provided for "DomainName".
  3. Click "Edit record" in the "Record details" and then paste the copied "PublicIp" into the "Value" textbox.
  4. Click "Save".
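
After saving the record, it can take some time for the change to become visible. One way to confirm that the domain now resolves to the "PublicIp" value is a small Node.js check like the sketch below. The function name domainPointsTo and the example domain and IP address are placeholders introduced for illustration; substitute your own values.

```javascript
const dns = require('dns');

// Returns true when `domain` has an A record equal to `expectedIp`.
// `resolve4` can be swapped out so the check can be exercised offline.
async function domainPointsTo(
  domain,
  expectedIp,
  resolve4 = (name) => dns.promises.resolve4(name)
) {
  try {
    const addresses = await resolve4(domain);
    return addresses.includes(expectedIp);
  } catch (error) {
    // Lookup failures (unknown domain, timeout) are treated as "not ready yet".
    return false;
  }
}

// Example usage with placeholder values:
// domainPointsTo('goliath.example.com', '203.0.113.10').then(console.log);
```

If the check keeps returning false, re-open the record in Route 53 and confirm that the value matches the "PublicIp" output exactly.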

Access Goliath

You can access the Goliath application through the "DashboardUrl" or "DashboardUrlIp" provided in the "Outputs" tab.

(If you encounter a "502 Bad Gateway" error, please wait for about 5 minutes before refreshing the page.)
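
Instead of refreshing the page by hand, you could poll the URL until it stops returning 502. The sketch below is one possible approach, not part of the product: the function name waitUntilReady and its options are hypothetical, and it assumes Node.js 18 or later for the built-in fetch.

```javascript
// Polls `url` until it responds with a status other than 502, or gives up.
// `fetchFn` defaults to the global fetch (Node.js 18+) and can be replaced.
async function waitUntilReady(
  url,
  { intervalMs = 30000, maxAttempts = 10, fetchFn = globalThis.fetch } = {}
) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      const response = await fetchFn(url);
      if (response.status !== 502) {
        return { ready: true, attempts: attempt, status: response.status };
      }
    } catch (error) {
      // Connection refused or reset: the server is probably still starting.
    }
    if (attempt < maxAttempts) {
      // Wait before the next attempt.
      await new Promise((resolve) => setTimeout(resolve, intervalMs));
    }
  }
  return { ready: false, attempts: maxAttempts };
}

// Example usage (replace the placeholder with your "DashboardUrl" value):
// waitUntilReady('https://<DomainName>/').then(console.log);
```

With the default options this checks every 30 seconds for up to about 5 minutes, which matches the wait suggested above.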

Generate SSL Manually

Goliath automatically tries to set up SSL for the provided domain name if that domain is hosted on Route 53. If that is unsuccessful, you have to set up SSL manually.

Step 1: Copy IP Address

  1. Proceed with the instructions outlined in the above "Update DNS" section, if you have not already done so.
  2. Copy the public IP address indicated as "PublicIp" in the "Outputs" tab.

Step 2: Log in to the server

  1. Open the terminal and go to the directory where your private key is located.
  2. Paste the following command into your terminal and press Enter:
    ssh -i <your key name> ubuntu@<Public IP address>
  3. Type "yes" and press Enter. This will log you into the server.

Step 3: Generate SSL

Paste the following command into your terminal, press Enter, and follow the instructions:

sudo /root/certificate_generate_standalone.sh

The admin email is required to generate the SSL certificates.

Shutting Down Goliath

  1. Click the link labeled "Goliath" in the "Resources" tab. You will be directed to the Goliath instance in EC2.
  2. Select the instance by marking the checkbox and click "Stop instance" from the "Instance state" dropdown. You can restart the instance at your convenience by selecting "Start instance".

Remove Goliath

Delete the stack that has been created in the AWS Management Console under 'CloudFormation Stacks' by clicking the 'Delete' button.

API Documentation

1. Retrieve Completions

Retrieves completions based on the provided prompt.

  • Endpoint: /v1/completions
  • Method: POST
  • Request Body:
{
  "prompt": "\n\n### Instructions:\nWhat is the capital of France?\n\n### Response:\n",
  "stop": ["\n", "###"]
}
  • Response Body:
{
  "id": "cmpl-738c943d-b6d9-40d3-acdf-26b259372079",
  "object": "text_completion",
  "created": 1704785130,
  "model": "/root/models/goliath-120b.Q5_K_M.gguf",
  "choices": [
    {
      "text": "The capital of France is Paris.",
      "index": 0,
      "logprobs": null,
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 25,
    "completion_tokens": 7,
    "total_tokens": 32
  }
}
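
The sample request wraps the question in an "### Instructions / ### Response" template and stops generation at the first newline or "###". A small helper can build that body for any question. This is a sketch under assumptions: the names buildCompletionRequest and complete are hypothetical, baseUrl stands for your own deployment's address, and the global fetch requires Node.js 18 or later.

```javascript
// Wraps a question in the instruction template used by the sample request.
function buildCompletionRequest(question) {
  return {
    prompt: `\n\n### Instructions:\n${question}\n\n### Response:\n`,
    // Stop at the first newline or at the start of a new "###" section.
    stop: ['\n', '###'],
  };
}

// Sends the request and returns the generated text.
// `baseUrl` is a placeholder such as "https://<DomainName>".
async function complete(baseUrl, question) {
  const response = await fetch(`${baseUrl}/v1/completions`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(buildCompletionRequest(question)),
  });
  const body = await response.json();
  return body.choices[0].text;
}

// Prints the same request body as the sample above.
console.log(JSON.stringify(buildCompletionRequest('What is the capital of France?'), null, 2));
```

Because "\n" is a stop sequence, this body suits short, single-line answers; for longer output you would likely drop it from the stop list.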

2. Retrieve Embeddings

Retrieves embeddings based on the provided input text.

  • Endpoint: /v1/embeddings
  • Method: POST
  • Request Body:
{
  "input": "The food was delicious and the waiter..."
}
  • Response Body:
{
    "object": "list",
    "data": [
        {
            "object": "embedding",
            "embedding": [
                -0.07521496713161469,
                0.44098934531211853,
                0.6786724328994751,
                ...
            ],
            "index": 0
        }
    ],
    "model": "/root/models/goliath-120b.Q5_K_M.gguf",
    "usage": {
        "prompt_tokens": 11,
        "total_tokens": 11
    }
}
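
A common use of embeddings is comparing two texts by the cosine similarity of their vectors. The sketch below shows how the vector could be pulled out of a response shaped like the one above and compared with another; the helper names embeddingFrom and cosineSimilarity are hypothetical, and the short vectors are made-up values for illustration, not real model output.

```javascript
// Extracts the first embedding vector from an /v1/embeddings response.
function embeddingFrom(response) {
  return response.data[0].embedding;
}

// Cosine similarity: dot(a, b) / (|a| * |b|), in the range [-1, 1].
function cosineSimilarity(a, b) {
  if (a.length !== b.length) {
    throw new Error('Vectors must have the same length');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Illustrative responses with tiny made-up vectors.
const first = { object: 'list', data: [{ object: 'embedding', embedding: [3, 4], index: 0 }] };
const second = { object: 'list', data: [{ object: 'embedding', embedding: [4, -3], index: 0 }] };

console.log(cosineSimilarity(embeddingFrom(first), embeddingFrom(first)));  // 1
console.log(cosineSimilarity(embeddingFrom(first), embeddingFrom(second))); // 0
```

Values near 1 suggest similar texts, and values near 0 suggest unrelated ones.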

3. Retrieve Chat Completions

Retrieves chat completions based on the provided chat messages.

  • Endpoint: /v1/chat/completions
  • Method: POST
  • Request Body:
{
  "messages": [
    {
      "content": "You are a helpful assistant.",
      "role": "system"
    },
    {
      "content": "What is the primary purpose of version control systems in software development?",
      "role": "user"
    }
  ]
}
  • Response Body:
{
  "id": "chatcmpl-a8f722b1-dddc-4006-bd57-c0dc1b82f85a",
  "object": "chat.completion",
  "created": 1704785453,
  "model": "/root/models/goliath-120b.Q5_K_M.gguf",
  "choices": [
    {
      "index": 0,
      "message": {
        "content": "\n\nVersion control systems (VCS) play a crucial role in software development by managing and tracking changes made to source code over time. The primary purpose of VCS in software development can be distilled down to three main points:\n\n1. **Versioning**: VCS maintains a history of all the changes made to the source code, creating a version-controlled repository. Each change is saved as a new version or commit, allowing developers to easily switch between different states of the project if needed. This feature enables developers to rollback to an earlier version in case of bugs or errors introduced by recent changes.\n2. **Collaboration**: VCS facilitates team collaboration by enabling multiple developers to work on the same codebase simultaneously without overwriting each other's changes. By merging branches or pull requests, team members can integrate their work into a centralized repository, ensuring that everyone is working with the latest version of the code. This feature significantly improves productivity and reduces the chances of conflicts or duplication of efforts.\n3. **Code Management**: VCS provide a centralized and organized approach to code management. Developers can create branches for different features or bug fixes, which can be tested and reviewed before merging into the main branch. This workflow allows developers to work on isolated branches without affecting the stability of the main branch until their changes are fully tested and approved. Additionally, VCS also provide tools for code reviews, issue tracking, and project management, streamlining the entire software development lifecycle.\n\nIn summary, version control systems are indispensable in modern software development because they enable developers to track changes, collaborate efficiently, and manage their codebase more effectively.",
        "role": "assistant"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 39,
    "completion_tokens": 361,
    "total_tokens": 400
  }
}
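
In a response like the one above, the reply text is in choices[0].message.content and the token counts are in usage, where prompt_tokens plus completion_tokens equals total_tokens (39 + 361 = 400). The following sketch pulls those fields out; summarizeChatResponse is a hypothetical helper name, and the sample object is an abbreviated copy of the response above.

```javascript
// Collects the reply text and token usage from a chat completion response.
function summarizeChatResponse(response) {
  const choice = response.choices[0];
  const { prompt_tokens, completion_tokens, total_tokens } = response.usage;
  return {
    // The sample reply starts with blank lines, so trim surrounding whitespace.
    text: choice.message.content.trim(),
    finishReason: choice.finish_reason,
    totalTokens: total_tokens,
    usageIsConsistent: prompt_tokens + completion_tokens === total_tokens,
  };
}

// Abbreviated copy of the sample response (content shortened for illustration).
const sample = {
  object: 'chat.completion',
  choices: [
    {
      index: 0,
      message: { content: '\n\nVersion control systems (VCS) play a crucial role.', role: 'assistant' },
      finish_reason: 'stop',
    },
  ],
  usage: { prompt_tokens: 39, completion_tokens: 361, total_tokens: 400 },
};

console.log(summarizeChatResponse(sample));
```

A finish_reason of "stop" indicates that the model ended the reply on its own rather than being cut off.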

4. List Models

Retrieves a list of available models.

  • Endpoint: /v1/models
  • Method: GET
  • Response Body:
{
  "object": "list",
  "data": [
    {
      "id": "/root/models/goliath-120b.Q5_K_M.gguf",
      "object": "model",
      "owned_by": "me",
      "permissions": []
    }
  ]
}
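
The "id" returned here is the value that appears in the "model" field of the other responses. If you want to set "model" explicitly in a request, one option is to read it from this list first. The sketch below assumes a response shaped like the one above; modelIds and withModel are hypothetical helper names.

```javascript
// Returns the ids of all entries reported by /v1/models.
function modelIds(listResponse) {
  return listResponse.data
    .filter((entry) => entry.object === 'model')
    .map((entry) => entry.id);
}

// Returns a copy of `body` with "model" set to the first reported id.
function withModel(body, listResponse) {
  const [id] = modelIds(listResponse);
  if (!id) {
    throw new Error('No models reported by the server');
  }
  return { ...body, model: id };
}

// Same shape as the sample response above.
const listResponse = {
  object: 'list',
  data: [
    { id: '/root/models/goliath-120b.Q5_K_M.gguf', object: 'model', owned_by: 'me', permissions: [] },
  ],
};

console.log(withModel({ input: 'The food was delicious and the waiter...' }, listResponse));
```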

Testing the API

  1. Create a directory.
  2. Create three files (the full contents are given below):
    app.js
    package.json
    .env
  3. Run the following command:
    npm install
  4. Edit the variable file (.env).
  5. Run the following command:
    npm start
  6. The responses are printed to the terminal.

app.js

const axios = require('axios');
require('dotenv').config();

// POST `data` to `url`. Resolves to { success, data } or { success: false, error }.
const makePostRequest = async (url, data, timeout) => {
  try {
    const response = await axios.post(url, data, { timeout });
    return { success: response.status === 200, data: response.data };
  } catch (error) {
    return { success: false, error: error.message };
  }
};

// GET `url`. Takes the same (url, data, timeout) arguments as makePostRequest
// so both helpers can be called the same way; `data` is ignored.
const makeGetRequest = async (url, _data, timeout) => {
  try {
    const response = await axios.get(url, { timeout });
    return { success: response.status === 200, data: response.data };
  } catch (error) {
    return { success: false, error: error.message };
  }
};

// Pretty-print a response body for one endpoint.
const printResponseData = (endpoint, data) => {
  console.log(`Response for ${endpoint}:`);
  console.log(JSON.stringify(data, null, 2));
  console.log('');
};

const checkEndpoints = async () => {
  const baseUrl = process.env.BASE_URL;
  const model = process.env.MODEL;
  // Environment variables are strings, so convert the timeout to a number.
  const timeout = Number(process.env.REQUEST_TIMEOUT) || 50000;

  const endpoints = [
    { path: '/completions', method: makePostRequest, data: { "model": model, "prompt": process.env.PROMPT1 }, printEnv: 'PRINT_COMPLETIONS_RESPONSE' },
    { path: '/embeddings', method: makePostRequest, data: { "input": process.env.PROMPT2, "model": model }, printEnv: 'PRINT_EMBEDDINGS_RESPONSE' },
    { path: '/chat/completions', method: makePostRequest, data: { "messages": [{ "content": "You are a helpful assistant.", "role": "system" }, { "content": process.env.PROMPT1, "role": "user" }], "model": model }, printEnv: 'PRINT_CHAT_COMPLETIONS_RESPONSE' },
    { path: '/models', method: makeGetRequest, printEnv: 'PRINT_MODELS_RESPONSE' }
  ];

  // Call each endpoint in turn and report whether it responded with HTTP 200.
  for (const endpoint of endpoints) {
    const url = `${baseUrl}${endpoint.path}`;
    const { success, data, error } = await endpoint.method(url, endpoint.data, timeout);
    const printResponse = process.env[endpoint.printEnv] === 'true';

    if (success) {
      console.log(`*** Endpoint ${endpoint.path} is reachable.`);
      if (printResponse) {
        printResponseData(endpoint.path, data);
      }
      console.log('');
    } else {
      console.log(`*** Endpoint ${endpoint.path} is not reachable. Error:`, error);
    }
  }
};

checkEndpoints();

package.json

{
  "name": "test-llama",
  "version": "1.0.0",
  "description": "",
  "main": "app.js",
  "scripts": {
    "start": "node app.js",
    "test": "echo \"Error: no test specified\" && exit 1"
  },
  "author": "",
  "license": "ISC",
  "dependencies": {
    "axios": "^1.6.7",
    "dotenv": "^16.4.1"
  }
}

.env

# Base URL for the API (replace with the URL of your own deployment)
BASE_URL=https://<your DomainName>/v1

# Model to be used in requests (the id returned by /v1/models)
MODEL=/root/models/goliath-120b.Q5_K_M.gguf

# Prompts for different endpoints
# /completions and /chat/completions
PROMPT1=What is the capital of France?
# /embeddings
PROMPT2=The food was delicious and the waiter...

# Whether to print responses for each endpoint
PRINT_COMPLETIONS_RESPONSE=true
PRINT_EMBEDDINGS_RESPONSE=false
PRINT_CHAT_COMPLETIONS_RESPONSE=true
PRINT_MODELS_RESPONSE=true

# Timeout for requests in milliseconds (default is 50000)
REQUEST_TIMEOUT=50000
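
Because app.js reads these variables straight from process.env, a missing or mistyped value only shows up later as a failed request. A small validation step run before any request can surface such mistakes earlier. The sketch below is an optional addition, not part of the test files; loadConfig is a hypothetical helper, and the sample values passed to it are placeholders.

```javascript
// Validates the variables used by the test script and returns typed values.
function loadConfig(env) {
  const required = ['BASE_URL', 'MODEL', 'PROMPT1', 'PROMPT2'];
  const missing = required.filter((key) => !env[key]);
  if (missing.length > 0) {
    throw new Error(`Missing required variables: ${missing.join(', ')}`);
  }

  // Environment values are strings; fall back to the documented default.
  const timeoutMs = Number(env.REQUEST_TIMEOUT || 50000);
  if (!Number.isFinite(timeoutMs) || timeoutMs <= 0) {
    throw new Error('REQUEST_TIMEOUT must be a positive number of milliseconds');
  }

  return {
    // Drop trailing slashes so that paths such as "/models" join cleanly.
    baseUrl: env.BASE_URL.replace(/\/+$/, ''),
    model: env.MODEL,
    prompts: { completion: env.PROMPT1, embedding: env.PROMPT2 },
    timeoutMs,
    print: {
      completions: env.PRINT_COMPLETIONS_RESPONSE === 'true',
      embeddings: env.PRINT_EMBEDDINGS_RESPONSE === 'true',
      chat: env.PRINT_CHAT_COMPLETIONS_RESPONSE === 'true',
      models: env.PRINT_MODELS_RESPONSE === 'true',
    },
  };
}

// Example with placeholder values; in app.js you would pass process.env.
const config = loadConfig({
  BASE_URL: 'https://goliath.example.com/v1/',
  MODEL: '/root/models/goliath-120b.Q5_K_M.gguf',
  PROMPT1: 'What is the capital of France?',
  PROMPT2: 'The food was delicious and the waiter...',
  PRINT_MODELS_RESPONSE: 'true',
});
console.log(config.baseUrl, config.timeoutMs);
```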

Check Server Logs

Step 1: Log in to the server

  1. Open the terminal and go to the directory where your private key is located.
  2. Paste the following command into your terminal and press Enter:
    ssh -i <your key name> ubuntu@<Public IP address>
  3. Type "yes" and press Enter. This will log you into the server.

Step 2: Check the logs

sudo tail -f /var/log/syslog

Upgrades

When an upgrade is released, we update the product with a newer version. You can check the product version in AWS Marketplace. If a newer version is available, you can remove the previous version and launch the product again using the newer version. Remember to back up any necessary server data before removing the previous version.

Troubleshoot

  1. If you see an error about exceeding your vCPU quota, follow the https://meetrix.io/articles/how-to-increase-aws-quota/ blog post to increase the vCPU quota.
  2. If you see an error such as "do not have sufficient <instance_type> capacity..." while creating the stack, try changing the region or try creating the stack at a later time.
  3. If you see an error page (such as "502 Bad Gateway") when you try to access the API dashboard, wait 5-10 minutes and then try again.

Conclusion

The Meetrix Goliath Developer Guide gives developers of any experience level clear, step-by-step instructions for integrating Goliath 120B into an AWS environment. Goliath 120B combines two fine-tuned Llama 70B models and offers 4-bit operation, a 4k context, and improved fp16 performance, outperforming GPT-4. It is ready for immediate deployment, with API integration and OpenAI compatibility for flexible applications.

Technical Support

Reach out to Meetrix Support (support@meetrix.io) for assistance with Goliath issues.
