Hi, I'm working on the National Vulnerability Database (NVD) and want to classify the vulnerable software into categories. I already have the categories and a good training set to feed into a machine learning algorithm.

The original idea was to use the vulnerability description in NVD to categorise the software, but this obviously won't work, because the description covers the vulnerability rather than the software itself.

Then we thought of downloading the first paragraph of the Wikipedia entry for that software. This works only 10% of the time, as many entries do not match any page. This is an example of a page that cannot load. Further manual Google queries seem to identify that software as a VoIP server. In some other cases, e.g. for the software Swift, the returned page is definitely not related to the software, and on the disambiguation page (section "Software and information technology") it is not even clear which entry is the one of interest.
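To make the failure modes concrete, here is a rough sketch of the lookup, not our exact code. It assumes the MediaWiki query API with the `extracts` and `pageprops` properties; the helper names `intro_url` and `classify` are made up for this post. It only builds the request URL and sorts an already-decoded JSON response into "missing", "ambiguous" or "ok", so the HTTP call itself is left out.

```python
from urllib.parse import urlencode

API = "https://en.wikipedia.org/w/api.php"


def intro_url(title):
    """Build a query URL asking for the plain-text intro of one page."""
    params = {
        "action": "query",
        "format": "json",
        "redirects": 1,
        "prop": "extracts|pageprops",
        "exintro": 1,          # only the text before the first heading
        "explaintext": 1,      # plain text instead of HTML
        "ppprop": "disambiguation",
        "titles": title,
    }
    return API + "?" + urlencode(params)


def classify(response):
    """Sort a decoded API response into (status, intro_text)."""
    pages = response.get("query", {}).get("pages", {})
    for page in pages.values():
        if "missing" in page:
            return ("missing", None)        # no page with that title
        if "disambiguation" in page.get("pageprops", {}):
            return ("ambiguous", None)      # e.g. the Swift case
        return ("ok", page.get("extract", ""))
    return ("missing", None)
```

Counting the three outcomes over the whole NVD product list would at least show how much of the 90% is "no page" versus "wrong or ambiguous page".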

Do you have suggestions to mitigate this problem? Are there more reliable software-related databases than Wikipedia? Are there better ways to query the dataset than feeding it the bare software name provided by NVD (e.g. up-ux_v, vendor:Nec)? Is there a way to include the vendor in the query, so as to make the results more reliable?
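By "including the vendor" I mean something like the sketch below. It is only an illustration under my own assumptions: that underscores in NVD names stand for spaces, and that trying the most specific query first is sensible. The function names `clean` and `candidate_queries` are hypothetical.

```python
import re


def clean(token):
    """Turn an NVD-style name into plain words (underscores become spaces)."""
    return re.sub(r"\s+", " ", token.replace("_", " ")).strip()


def candidate_queries(vendor, product):
    """Return search strings from most to least specific, without duplicates."""
    v, p = clean(vendor), clean(product)
    if not p:
        return []
    raw = [(v, p, "software"), (v, p), (p, "software"), (p,)]
    out = []
    for parts in raw:
        query = " ".join(part for part in parts if part)
        # Skip queries that only differ by case from one already kept
        if query.lower() not in (q.lower() for q in out):
            out.append(query)
    return out


print(candidate_queries("Nec", "up-ux_v"))
```

The idea would be to walk the list and stop at the first query that returns a non-ambiguous page.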

Ever faced a problem like that?


submitted by mailor