Skip to content

Tgi llamacpp#147

Open
fgbelidji wants to merge 32 commits intoawslabs:mainfrom
fgbelidji:tgi-llamacpp
Open

Tgi llamacpp#147
fgbelidji wants to merge 32 commits intoawslabs:mainfrom
fgbelidji:tgi-llamacpp

Conversation

@fgbelidji
Copy link
Copy Markdown

Issue #, if available:

Description of changes:

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@fgbelidji fgbelidji requested a review from a team as a code owner April 17, 2025 15:46
"TGI": ["GPU", "INF2"],
"TEI": ["GPU", "CPU"],
"TGILLAMACPP": ["CPU"],
"TGILLAMACPP": ["GPU", "CPU"],
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

interesting, I thought it's just a CPU image.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah and ideally for Graviton instances there should be a different container to maximize the performance, see https://huggingface.co/docs/text-generation-inference/backends/llamacpp#build-docker-image

ENTRYPOINT ["./entrypoint.sh"]
CMD ["--json-output"]

LABEL dlc_major_version="2"
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dlc_major_version = "2" ? shouldn't it be "1" considering that we are contributing it for the first time?

ENTRYPOINT ["./entrypoint.sh"]
CMD ["--json-output"]

LABEL dlc_major_version="2"
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same comment as above

@@ -0,0 +1,21 @@
#!/bin/bash
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

correct me if I am wrong but this is because of the vulnerability right? I'll take an AI to check with hosting platform if they have got any long term remediation for this vulnerability or not.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

my understanding is, this is an overhead every time we are gonna add a new image right?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed that's for the vulnerability, and yes this is a an overhead for all the new cuda based images

Comment thread releases.json
{
"framework": "TEI",
"framework": "TGILLAMACPP",
"device": "gpu",
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dont you want to add both CPU and GPU images?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, tested both independently locally, but will ad that to build the two

@malav-shastri
Copy link
Copy Markdown
Member

just for sanity, what would be the final image tag here?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants