Search code examples
Why does TFLite INT8 quantization decompose BatchMatMul (from Einsum) into many FullyConnected layer...


tensorflow, onnx, quantization, tflite, einsum

Read More
Straight-Through estimation for vector quantization inside a recurrent neural network...


tensorflow, quantization

Read More
Quantize Image using PIL and numpy...


image-processing, python-imaging-library, quantization

Read More
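The PIL/NumPy question above boils down to nearest-palette mapping: pick a small set of colors and send every pixel to its closest one. A minimal NumPy-only sketch of that idea (the palette and the 2x2 test image are made up for illustration; PIL's own `Image.quantize(colors=...)` builds the palette for you via median cut):

```python
import numpy as np

def quantize_to_palette(img, palette):
    """Map each RGB pixel to the nearest palette color (Euclidean distance).

    img:     (H, W, 3) uint8 array
    palette: (K, 3) uint8 array of palette colors
    Returns (H, W) palette indices and the quantized image.
    """
    pixels = img.reshape(-1, 3).astype(np.int32)            # (H*W, 3)
    pal = palette.astype(np.int32)                          # (K, 3)
    # Squared distance from every pixel to every palette entry.
    d2 = ((pixels[:, None, :] - pal[None, :, :]) ** 2).sum(axis=2)
    idx = d2.argmin(axis=1)                                 # nearest color index
    quantized = palette[idx].reshape(img.shape)
    return idx.reshape(img.shape[:2]), quantized

# Tiny made-up image and a black/white/red palette.
img = np.array([[[250, 250, 250], [5, 5, 5]],
                [[200, 10, 10], [120, 120, 120]]], dtype=np.uint8)
palette = np.array([[0, 0, 0], [255, 255, 255], [255, 0, 0]], dtype=np.uint8)
idx, q = quantize_to_palette(img, palette)
```

The brute-force distance matrix is O(pixels × palette), which is fine for small palettes; PIL uses faster internal structures.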
RuntimeError: CUDA error: named symbol not found when using TorchAoConfig with Qwen2.5-VL-7B-Instruc...


python, pytorch, huggingface-transformers, huggingface, quantization

Read More
What is the difference, if any, between model.half() and model.to(dtype=torch.float16) in huggingfac...


python, huggingface-transformers, huggingface, quantization, half-precision-float

Read More
How to Load a 4-bit Quantized VLM Model from Hugging Face with Transformers?...


python, nlp, huggingface-transformers, huggingface, quantization

Read More
How to quantize a HF safetensors model and save it to llama.cpp GGUF format with less than q8_0 quan...


large-language-model, huggingface, quantization, llamacpp

Read More
Why are model_q4.onnx and model_q4f16.onnx not 4 times smaller than model.onnx?...


deep-learning, large-language-model, huggingface, onnx, quantization

Read More
HuggingFace - 'optimum' ModuleNotFoundError...


python, huggingface-transformers, quantization, modulenotfounderror, pruning

Read More
Does static quantization enable the model to feed a layer with the output of the previous one, witho...


neural-network, artificial-intelligence, onnx, quantization, static-quantization

Read More
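On the static-quantization question above: in many int8 runtimes the answer is yes — one layer's int8 output feeds the next directly, because the int32 accumulator is rescaled to the next layer's int8 grid with a single combined multiplier rather than a float dequantize/requantize round trip. A hedged NumPy sketch of that requantization step (the scales here are made up; real runtimes derive them from calibration and typically use fixed-point multipliers, and zero-points are fixed at 0 to keep the sketch short):

```python
import numpy as np

def quantize(x, scale, zp):
    """Affine-quantize float x to int8 with a given scale/zero-point."""
    return np.clip(np.round(x / scale) + zp, -128, 127).astype(np.int8)

def int8_linear(x_q, w_q, x_scale, w_scale, out_scale):
    """One fully connected layer done entirely in integers.

    The int32 accumulator is rescaled straight to the output's int8
    grid -- no intermediate float tensor is materialized.
    """
    acc = x_q.astype(np.int32) @ w_q.astype(np.int32)   # int32 accumulator
    requant = acc * (x_scale * w_scale / out_scale)     # one combined multiplier
    return np.clip(np.round(requant), -128, 127).astype(np.int8)

x = np.array([0.5, -0.25, 1.0], dtype=np.float32)
w = np.array([[1.0], [2.0], [-1.0]], dtype=np.float32)

x_q = quantize(x, scale=0.01, zp=0)   # illustrative, not calibrated, scales
w_q = quantize(w, scale=0.02, zp=0)
y_q = int8_linear(x_q, w_q, x_scale=0.01, w_scale=0.02, out_scale=0.02)
y = y_q.astype(np.float32) * 0.02     # dequantize only at the very end
```

The float result of this layer is 0.5·1 − 0.25·2 + 1·(−1) = −1.0, and the integer pipeline recovers it exactly here because the toy scales divide evenly; in general there is a small requantization error.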
Speeding up load time of LLMs...


huggingface-transformers, large-language-model, quantization

Read More
Llama QLora error: Target modules ['query_key_value', 'dense', 'dense_h_to_4h&#3...


python, quantization, large-language-model, peft

Read More
How to set training=False for keras-model/layer outside of the __call__ method?...


tensorflow, keras, transfer-learning, quantization, tfmot

Read More
Difference between gguf and lora...


large-language-model, quantization, peft

Read More
Quantization 4 bit and 8 bit - error in 'quantization_config'...


gpu, local, large-language-model, quantization, 8-bit

Read More
Quantization and torch_dtype in huggingface transformer...


huggingface-transformers, huggingface, quantization

Read More
Image quantization with Numpy...


python, numpy, quantization

Read More
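For the NumPy image-quantization question, the usual starting point is uniform level reduction: bin each pixel value and map it to the center of its bin. A small sketch (the 4-level example values are illustrative):

```python
import numpy as np

def uniform_quantize(img, levels):
    """Reduce a uint8 image to `levels` evenly spaced gray values."""
    step = 256 / levels
    bins = np.floor(img / step)          # which interval each pixel falls in
    # Map each bin back to the center of its interval, stay in uint8 range.
    return np.clip(bins * step + step / 2, 0, 255).astype(np.uint8)

img = np.array([[0, 60, 130, 255]], dtype=np.uint8)
q = uniform_quantize(img, levels=4)      # 4 gray levels: 32, 96, 160, 224
```

This is purely vectorized, so it scales to full-size images; non-uniform schemes (histogram- or k-means-based) only change how the bin edges are chosen.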
jpeg python 8x8 window DCT and quantisation process...


python, huffman-code, quantization, dct

Read More
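For the JPEG DCT/quantisation question, the lossy step per 8x8 block is: level-shift by 128, apply a 2-D DCT, divide element-wise by a quantization table, and round. A NumPy sketch using the widely quoted Annex-K luminance table (Huffman coding, which the question also tags, is a separate lossless step not shown here):

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II basis matrix, as applied to 8x8 JPEG blocks."""
    k = np.arange(n)
    C = np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    C[0] *= 1 / np.sqrt(2)
    return C * np.sqrt(2 / n)

# Standard JPEG luminance quantization table (quality ~50).
Q50 = np.array([
    [16, 11, 10, 16,  24,  40,  51,  61],
    [12, 12, 14, 19,  26,  58,  60,  55],
    [14, 13, 16, 24,  40,  57,  69,  56],
    [14, 17, 22, 29,  51,  87,  80,  62],
    [18, 22, 37, 56,  68, 109, 103,  77],
    [24, 35, 55, 64,  81, 104, 113,  92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103,  99],
])

def jpeg_block(block, q=Q50):
    """Level-shift, 2-D DCT, divide by the table, round -- the lossy step."""
    C = dct_matrix(8)
    coeffs = C @ (block.astype(np.float64) - 128) @ C.T
    return np.round(coeffs / q).astype(np.int64)

block = np.full((8, 8), 128, dtype=np.uint8)
qcoeffs = jpeg_block(block)   # flat block -> every coefficient quantizes to 0
```

A flat block produces only a DC coefficient, and high-frequency entries are divided by large table values, which is exactly where JPEG discards detail.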
What's an elegant way to avoid "hopping" quantization errors when graphing a divergent...


c++, qt, sampling, graphing, quantization

Read More
There exists ONNX or Tensorflow CNN 4-bit quantized models available?...


tensorflow, keras, onnx, quantization

Read More
What is the mathematical definition of the quantile transformation in xgboost.QuantileDMatrix?...


python, machine-learning, xgboost, quantile, quantization

Read More
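xgboost's QuantileDMatrix actually builds histograms with a weighted quantile sketch, but the underlying idea — cut points at evenly spaced quantiles, each feature value mapped to a bin index — can be shown in plain NumPy (this illustrates the binning concept only, not xgboost's exact algorithm):

```python
import numpy as np

def quantile_bins(x, n_bins):
    """Cut points at evenly spaced quantiles; each value maps to a bin index."""
    # Interior quantiles only: n_bins bins need n_bins - 1 cut points.
    qs = np.quantile(x, np.linspace(0, 1, n_bins + 1)[1:-1])
    return np.searchsorted(qs, x, side="left"), qs

x = np.array([1.0, 2.0, 3.0, 4.0, 100.0])
bins, cuts = quantile_bins(x, n_bins=4)
```

Note how the outlier 100.0 lands in the last bin without stretching the others — the point of quantile-based (rather than uniform) binning.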
Quantizing normally distributed floats in Python and NumPy...


python, numpy, floating-point, k-means, quantization

Read More
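For quantizing normally distributed floats, 1-D k-means (Lloyd's algorithm) converges toward the Lloyd-Max quantizer, which places levels densely where the density is high. A sketch (quantile initialization is one common choice, not the only one, and the code assumes no cluster empties out — which holds for well-spread initial centers on continuous data):

```python
import numpy as np

def lloyd_max_1d(x, k, iters=50):
    """Lloyd's algorithm in 1-D: alternate assignment and centroid updates."""
    centers = np.quantile(x, (np.arange(k) + 0.5) / k)  # spread initial codebook
    for _ in range(iters):
        edges = (centers[:-1] + centers[1:]) / 2        # decision boundaries
        assign = np.searchsorted(edges, x)              # nearest-center labels
        centers = np.array([x[assign == j].mean() for j in range(k)])
    return centers, assign

rng = np.random.default_rng(0)
x = rng.normal(size=10_000)
centers, assign = lloyd_max_1d(x, k=4)
# For N(0, 1) the 4-level optimum is near ±0.45 and ±1.51.
```

Unlike uniform quantization, the reconstruction levels crowd around 0 where most of the probability mass sits, which is what minimizes mean squared error.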
Tensorflow quantization process in detail - nobody talks about this in detail...


python, tensorflow, tensorflow2.0, tensorflow-lite, quantization

Read More
ValueError: Unsupported ONNX opset version: 13...


python, pytorch, onnx, quantization, onnxruntime

Read More
NeuQuant.js (JavaScript color quantization) hidden bug in JS conversion...


javascript, neural-network, quantization

Read More
How to quantize inputs and outputs of optimized tflite model...


python, tensorflow-lite, quantization, google-coral

Read More
torch Parameter grad return none...


python, deep-learning, pytorch, quantization

Read More
How do you find the quantization parameter inside of the ONNX model resulted in converting already q...


onnx, yolov5, quantization, tf2onnx

Read More
Why are some nn.Linear layers not quantized by Pytorch?...


pytorch, quantization, static-quantization

Read More
Method to quantize a range of values to keep precision when signficant outliers are present in the d...


python, precision, outliers, quantization, data-transform

Read More
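For the outlier question, a common trick is to clip at inner percentiles before uniform quantization, so a few extreme values don't inflate the step size and wash out the inliers. A sketch (the 1st/99th percentile cutoffs are an arbitrary illustrative choice; pick them from your data):

```python
import numpy as np

def quantize_clipped(x, bits=8, lo_pct=1, hi_pct=99):
    """Clip to inner percentiles so outliers don't blow up the step size,
    then quantize uniformly to 2**bits levels."""
    lo, hi = np.percentile(x, [lo_pct, hi_pct])
    levels = 2 ** bits - 1
    scale = (hi - lo) / levels
    q = np.round((np.clip(x, lo, hi) - lo) / scale).astype(np.uint8)
    return q, lo, scale

def dequantize(q, lo, scale):
    return q * scale + lo

# 98 inliers in [0, 1] plus two extreme outliers.
x = np.concatenate([np.linspace(0, 1, 98), [1000.0, -1000.0]])
q, lo, scale = quantize_clipped(x)
x_hat = dequantize(q, lo, scale)
```

The inliers are reconstructed to within half a quantization step, while the outliers saturate at the clip boundaries — the trade-off this approach makes deliberately.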