Skip to content

codegeex2-6b-int4推理卡死,不输出任何内容 #269

@LBJ6666

Description

@LBJ6666

用demo推理卡死,不输出任何内容。

prompt = "# language: Python\n# write a bubble sort function\n"
inputs = tokenizer.encode(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(inputs, max_length=256, top_k=1)
response = tokenizer.decode(outputs[0])
print(response)

并且提示警告:

modeling_chatglm.py:227: UserWarning: 1Torch was not compiled with flash attention. (Triggered internally at ..\aten\src\ATen\native\transformers\cuda\sdp_utils.cpp:455.)
  context_layer = torch.nn.functional.scaled_dot_product_attention(query_layer, key_layer, value_layer,

请问下和这个警告有关系吗?

版本:transformers==4.30.2, protobuf==4.24.4, cpm-kernels==1.0.11, torch== 2.3.1+cu121

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions