Interesting approach! My concern is with the final results, though: it doesn't capture the context in which things are said or WHEN they are said (e.g. poll over 30% vs. poll over 48%: were they said at the same time, and if not, how far apart?). These additional features would help in actually figuring out the inconsistencies (e.g. the first example of $6mil USD coming through). It'd be great if we could add capturing these additional things to the aforementioned pipeline.
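To make the suggestion concrete, here is a rough sketch of what carrying time and context alongside each claim could look like. This is not the pipeline from the post: the `Claim` fields, the `max_gap` threshold, and the rule of only comparing claims made in the same context are all assumptions for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Claim:
    text: str           # the statement as quoted
    said_at: datetime   # when it was said (hypothetical field)
    context: str        # where or why it was said (hypothetical field)


def time_gap(a: Claim, b: Claim) -> timedelta:
    """Absolute time between two claims, regardless of order."""
    return abs(a.said_at - b.said_at)


def is_likely_inconsistent(a: Claim, b: Claim, max_gap: timedelta) -> bool:
    """Flag a pair only if both were said in the same context and close in time.

    max_gap is an arbitrary, assumed threshold: two different poll numbers
    months apart may simply reflect a real change, not a contradiction.
    """
    return a.context == b.context and time_gap(a, b) <= max_gap


if __name__ == "__main__":
    # Dates and context labels below are made up for the example.
    a = Claim("poll over 30%", datetime(2024, 1, 1), "rally")
    b = Claim("poll over 48%", datetime(2024, 1, 3), "rally")
    print(time_gap(a, b))
    print(is_likely_inconsistent(a, b, max_gap=timedelta(days=7)))
```

The point is only that a pair of similar-looking statements gets compared after checking how far apart they were said, rather than on embedding similarity alone.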
I wonder what happens when people do this for Biotech/QC/Medicine, domains that are super-specialized and hard to debunk.
Hey, can you confirm whether it uses sentence-transformers embedding models or an OpenAI embedding model? As stated in the Jina AI blog, one of their embedding models is trained on contrastive examples to solve this issue, so the difference will be large, I guess.
https://jina.ai/news/text-embeddings-fail-to-capture-word-order-and-how-to-fix-it