Update API Doc For Text-to-Speech

Apr 19, 2025 by ADMIN

Introduction

This article updates the API documentation for the Text-to-Speech feature in Pollinations. The current documentation contains a mistake, which this update corrects. It also explains the correct API endpoint, the payload structure, and the purpose of the developer role.

Correcting the API Endpoint

The current documentation for the Text-to-Speech feature is located at https://github.com/pollinations/pollinations/blob/master/APIDOCS.md#text-to-speech-post---openai-compatible-%EF%B8%8F%EF%B8%8F. However, this documentation contains a mistake. The correct API endpoint for the Text-to-Speech feature is a GET request to the URL https://text.pollinations.ai.

Correct Payload Structure for OpenAI Compatible API

For the POST method that uses the URL https://text.pollinations.ai/openai, the payload structure is as follows:

{
  "model": "openai-audio",
  "modalities": ["text", "audio"],
  "audio": {
    "voice": "alloy",
    "format": "pcm16"
  },
  "messages": [
    {
      "role": "developer",
      "content": "You are a versatile AI"
    },
    {
      "role": "user",
      "content": "Convert this longer text into speech using the selected voice. This method is better for larger inputs."
    }
  ],
  "private": false // optional
}
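The payload above can be assembled and wrapped in a POST request with a short Python sketch. The helper names `build_tts_payload` and `build_tts_request` are hypothetical and not part of any Pollinations client, and the default voice assumes the OpenAI voice name `alloy`. Because strict JSON does not allow comments, the optional `private` flag is simply left out unless the caller sets it.

```python
import json
import urllib.request

OPENAI_ENDPOINT = "https://text.pollinations.ai/openai"  # POST URL given in this article


def build_tts_payload(text, voice="alloy", audio_format="pcm16",
                      developer_prompt="You are a versatile AI", private=None):
    """Return the request body as a dict that mirrors the payload shown above."""
    payload = {
        "model": "openai-audio",
        "modalities": ["text", "audio"],
        "audio": {"voice": voice, "format": audio_format},
        "messages": [
            {"role": "developer", "content": developer_prompt},
            {"role": "user", "content": text},
        ],
    }
    # "private" is optional, so include it only when the caller sets it.
    if private is not None:
        payload["private"] = private
    return payload


def build_tts_request(payload):
    """Wrap the payload in a POST request object without sending it."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        OPENAI_ENDPOINT,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the result of `build_tts_request(build_tts_payload("Hello"))` to `urllib.request.urlopen` would send the request; building and sending are kept separate so the payload can be inspected first.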

Developer Role Description

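The developer role carries instructions that control the AI's behavior. A message with the role "developer" serves the same purpose as a system message in OpenAI-style chat payloads: it sets the assistant's persona and constraints before the user's message is processed. In the example above, the developer message "You are a versatile AI" sets a general-purpose persona, and the user message supplies the text to be spoken.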

Correcting the API Documentation

To correct the API documentation, we need to update the following sections:

Text-to-Speech GET Request: Update the API endpoint to https://text.pollinations.ai.
Text-to-Speech POST Request: Update the payload structure to match the correct structure for the OpenAI compatible API.
Developer Role Description: Add a description of the developer role and its purpose in controlling the AI's behavior.

Updated API Documentation

Here is the updated API documentation for the Text-to-Speech feature:

Text-to-Speech GET Request

Endpoint: https://text.pollinations.ai
Method: GET
Description: This API endpoint is used to generate text-to-speech audio using the OpenAI model.
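As a rough illustration of the GET variant, the sketch below builds a request URL from the base endpoint. The article only gives the base URL, so the layout used here (URL-encoded text in the path, `model` and `voice` as query parameters) is an assumption, and `build_tts_get_url` is a hypothetical helper name.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://text.pollinations.ai"  # GET endpoint given in this article


def build_tts_get_url(text, model="openai-audio", voice="alloy"):
    """Build a GET URL for a short piece of text.

    Assumption: the text goes in the path and the options go in the
    query string. Check the upstream API docs before relying on this.
    """
    path = quote(text, safe="")  # percent-encode spaces, slashes, and so on
    query = urlencode({"model": model, "voice": voice})
    return f"{BASE_URL}/{path}?{query}"
```

A GET URL has practical length limits, which is consistent with the article's note that the POST method suits larger inputs.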

Text-to-Speech POST Request

Endpoint: https://text.pollinations.ai/openai
Method: POST
Payload Structure:

{
  "model": "openai-audio",
  "modalities": ["text", "audio"],
  "audio": {
    "voice": "alloy",
    "format": "pcm16"
  },
  "messages": [
    {
      "role": "developer",
      "content": "You are a versatile AI"
    },
    {
      "role": "user",
      "content": "Convert this longer text into speech using the selected voice. This method is better for larger inputs."
    }
  ],
  "private": false // optional
}

Description: This API endpoint is used to generate text-to-speech audio using the OpenAI model. The payload structure includes the model, modalities, audio settings, messages, and private flag.
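Once the POST request succeeds, the audio still has to be pulled out of the response. The sketch below assumes the response follows the OpenAI chat completions layout, with base64-encoded audio at `choices[0].message.audio.data`; the article does not show a response, so treat that path as an assumption. Since `pcm16` is raw sample data with no header, the second helper wraps it in a WAV container, assuming 24 kHz mono audio. Both function names are hypothetical.

```python
import base64
import io
import wave


def extract_audio_bytes(response_json):
    """Decode the base64 audio payload, assuming an OpenAI-style response layout."""
    audio_b64 = response_json["choices"][0]["message"]["audio"]["data"]
    return base64.b64decode(audio_b64)


def pcm16_to_wav(pcm_bytes, sample_rate=24000, channels=1):
    """Wrap raw 16-bit PCM samples in an in-memory WAV container.

    The sample rate and channel count are assumptions; adjust them if
    playback sounds too fast or too slow.
    """
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(2)  # 16 bits = 2 bytes per sample
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm_bytes)
    return buffer.getvalue()
```

Writing the returned bytes to a file ending in `.wav` gives something most audio players can open directly.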

Developer Role Description

Description: The developer role carries instructions that control the AI's behavior, in the same way a system message does in OpenAI-style chat payloads. It is sent as the first entry in "messages", ahead of the user message that contains the text to convert.

Frequently Asked Questions

This section answers some of the most frequently asked questions about the Text-to-Speech API in Pollinations, covering the API endpoint, payload structure, developer role, and more.

Q: What is the correct API endpoint for the Text-to-Speech feature?

A: The correct API endpoint for the Text-to-Speech feature is a GET request to the URL https://text.pollinations.ai.

Q: What is the payload structure for the Text-to-Speech POST request?

A: The payload structure for the Text-to-Speech POST request is as follows:

{
  "model": "openai-audio",
  "modalities": ["text", "audio"],
  "audio": {
    "voice": "alloy",
    "format": "pcm16"
  },
  "messages": [
    {
      "role": "developer",
      "content": "You are a versatile AI"
    },
    {
      "role": "user",
      "content": "Convert this longer text into speech using the selected voice. This method is better for larger inputs."
    }
  ],
  "private": false // optional
}

Q: What is the purpose of the developer role description?

A: The developer role is used to control the AI's behavior. A message with the role "developer" gives the model its instructions, much like a system message in OpenAI-style chat payloads, before the user message is processed.

Q: Can I use the Text-to-Speech feature with other models?

A: Yes, you can use the Text-to-Speech feature with other models. However, you will need to update the payload structure to match the specific model you are using.

Q: Is the Text-to-Speech feature available for all users?

A: No, the Text-to-Speech feature is only available for users who have a valid API key. You will need to obtain an API key from Pollinations to use this feature.

Q: Can I use the Text-to-Speech feature for commercial purposes?

A: Yes, you can use the Text-to-Speech feature for commercial purposes. However, you will need to obtain a commercial license from Pollinations to use this feature for business purposes.

Q: How do I troubleshoot issues with the Text-to-Speech feature?

A: If you are experiencing issues with the Text-to-Speech feature, you can try the following:

Check the API endpoint and payload structure to ensure that they are correct.
Verify that you have a valid API key.
Check the status of the API request to ensure that it is successful.
Contact Pollinations support for further assistance.
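The first troubleshooting step, checking the payload structure, can be partly automated. The sketch below compares a payload against the fields used in this article's example. Treating those fields as required is an assumption, not a documented rule, and `find_payload_problems` is a hypothetical helper.

```python
def find_payload_problems(payload):
    """Return a list of human-readable problems; an empty list means none were found."""
    problems = []

    # Top-level fields that appear in the article's example payload.
    for key in ("model", "modalities", "audio", "messages"):
        if key not in payload:
            problems.append(f"missing top-level field: {key}")

    # Audio settings are only checked when the "audio" object is present.
    audio = payload.get("audio")
    if isinstance(audio, dict):
        for key in ("voice", "format"):
            if key not in audio:
                problems.append(f"missing audio.{key}")

    # Audio output has to be requested explicitly in "modalities".
    if "modalities" in payload and "audio" not in payload["modalities"]:
        problems.append('"modalities" does not include "audio"')

    # At least one user message should carry the text to speak.
    messages = payload.get("messages")
    if isinstance(messages, list):
        if not any(m.get("role") == "user" for m in messages):
            problems.append('no message with role "user"')

    return problems
```

Running this before sending a request catches structural mistakes locally, so a failed request is more likely to point at the endpoint or the network than at the payload.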

Conclusion

In this article, we addressed some of the most frequently asked questions about the Text-to-Speech API in Pollinations. We covered topics such as the API endpoint, payload structure, developer role description, and more. If you have any further questions or need assistance with the Text-to-Speech feature, please contact Pollinations support.
