recogmission

Archive for the ‘Uncategorized’ Category

Update to the discussion of adult content issues for web businesses

In Uncategorized on June 7, 2009 at 7:57 pm

A number of recent cases demonstrate that there are many issues with delivering adult content on the Web. Of the latest, I would like to highlight the following two:

The new Microsoft search engine Bing.com is perfect if you want to see porn multimedia: http://www.theregister.co.uk/2009/06/01/bing_hits_uk/

The ISP Triple Fibre Network has been closed by the FTC for multiple violations, including spam and child porn: http://www.ftc.gov/opa/2009/06/3fn.shtm

Regarding the latter article: if your site is hosted by some hosting provider or ISP, are you sure you are safe? An ISP can be closed at any time, and all your resources will become inaccessible.

Picollator and the big competition: Update

In Uncategorized on May 22, 2009 at 2:01 am

With so many announcements from new start-ups offering ‘image search’ or ‘automatic image tagging’, ‘new opportunities for customers’ and a ‘new approach to web search’, a user may hesitate over what to choose. I would like to provide a list of simple questions that the user should ask when evaluating new offers.

Q1: Does it (the announced service) have an online demo?
No=0, Yes=1

Q2: Does it let you try your own query?
No=0, Yes=1

Q3: Does it let you submit your own picture to search?
No=0, Yes=1

Q4: Does it work with grey-scale images and good artworks?
No=0, Yes=1

Q5: Does it let you submit a picture and text together to search?
No=0, Yes=1

Q6: Does it search for similar objects rather than identical ones?
No=0, Yes=1

Now add up all the answers. If you obtain Sum=6, pay attention to this new service.
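
The scoring above can be written down as a short Python sketch. The question keys and function names are made up for illustration; only the six questions and the Sum=6 rule come from the list.

```python
# Hypothetical keys, one per question Q1..Q6 of the checklist.
QUESTIONS = (
    "has_online_demo",                # Q1
    "accepts_user_query",             # Q2
    "accepts_user_picture",           # Q3
    "handles_greyscale_and_artwork",  # Q4
    "accepts_picture_and_text",       # Q5
    "finds_similar_not_identical",    # Q6
)


def checklist_score(answers):
    """Sum the answers: each Yes counts 1, each No (or missing answer) counts 0."""
    return sum(1 for question in QUESTIONS if answers.get(question, False))


def worth_attention(answers):
    """A service deserves attention only when all six answers are Yes."""
    return checklist_score(answers) == len(QUESTIONS)


# Example: a service that fails only Q5.
example = {question: True for question in QUESTIONS}
example["accepts_picture_and_text"] = False
print(checklist_score(example), worth_attention(example))
```

A service that misses even one question falls below Sum=6 and does not pass.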

Comment on ‘Walking the Cyberbeat’ by Newsweek

In Uncategorized on May 5, 2009 at 10:26 am

Newsweek has published an interesting article, ‘Walking the Cyberbeat’, about fighting adult and other unwanted content at Facebook and its biggest competitor, MySpace. The article is very fresh and it reflects the main issues with content moderation at social networks. However, one important thing attracted my attention. Nick Summers, the author of the article, gives an example of the typical work of a content moderator at Facebook. The example contains the following phrase:

After delivering a verdict on 75 of the 438,848 outstanding photos flagged by Facebook users… …Axten is off to a meeting. It’s just another day at the office of the world’s fastest-growing social-networking site.

And another one:

The 26-year-old Stanford grad is one of some 150 people the young company employs to keep the site clean—out of a total head count of 850.

In my words, from the point of view of a CEO, COO, CFO or CMO, it means the following:

  1. 150 people are paid regularly for doing manual work, and Facebook pays all the taxes, social insurance, operational expenses and overheads for a team of 150 people.
  2. No doubt, 150 people could do much more for Facebook than this very monotonous work.
  3. 150 of 850 people are in content moderation, which means that more than 17% of the whole team is doing this strange manual work.
  4. If each of them works at the pace from the example, those 150 people can moderate 150*75=11,250 of the more than 400,000 outstanding images, which is less than 3% of the daily demand. If you know ISO9001, it is more than 3 times less than 10% :)
  5. The results of their work are far from good, both because of the throughput and because they process only images which were flagged by Facebook users. How many other images among the billions at Facebook are not flagged?
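
The arithmetic in points 3 and 4 can be checked with a few lines of Python. The figures are the ones quoted from the Newsweek article; the assumption that every moderator handles 75 photos, as in the quoted example, is mine.

```python
moderators = 150
head_count = 850
photos_per_moderator = 75       # assumption: everyone works at the pace from the example
outstanding_photos = 438_848    # flagged photos waiting for a verdict

share_of_staff = moderators / head_count
photos_reviewed = moderators * photos_per_moderator
share_reviewed = photos_reviewed / outstanding_photos

print(f"Share of staff in moderation: {share_of_staff:.1%}")
print(f"Photos reviewed: {photos_reviewed}")
print(f"Share of the backlog reviewed: {share_reviewed:.2%}")
```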

I am not criticising Facebook. The same issues exist in all other companies running public web services, including MySpace. It reminds me of the history of industrial progress: in the 18th century most work was done by people. Then machines arose. It is very strange that managers in IT still think in terms of the 18th century.

So I would recommend a simple thing to them: just count everything, and then consider that it may be better to use piFilter as an automatic detection tool, which covers 100% of images and does not need social taxes, overheads and operational expenses.

Adult image detection service piFilter is ready

In Uncategorized on May 1, 2009 at 2:11 pm

You may plug-and-play with our new porn detection service www.pifilter.com . As I reported before, porn and adult content in images can be one of the big issues for online media, ISPs and search engines, because until now it could not be detected automatically. Now times are changing, and we have installed an online service which you can easily use.

pifilter website capture

If you run a social network, online albums, file sharing or an ISP, you may set up your system to query our engine on each image being viewed or uploaded in your service. piFilter provides a simple API to send the query and receive a response in a form like ‘porn’, ‘no porn’ or ‘suspicious’. It does not filter out images itself – it works much better by giving you the opportunity to manage the situation yourself.

The technology is a part of our strategic development with www.picollator.com, so be sure that we have implemented the best pattern recognition algorithms to analyze images in piFilter. The service does not analyse text (maybe not yet!); it recognizes the image content itself, so it is 100% language-independent and can work on English, Spanish, Chinese, Russian, German, Korean, Estonian – on any website. piFilter does not care about the language, because it sees the image rather than reads the text. This advantage is one of the key features for most social networks, blogs and ISPs, where there are many pageviews or uploads with no associated text content. The detection quality is above 80%, and piFilter can process all image sizes and types found on the Internet.

piFilter works by analogy with 99% of manual content moderators, but performs much faster, does not need overheads, administration and operational expenses, and is much cheaper in terms of direct costs. The commercial model resembles the best cases of SaaS – you pay only for usage, not for time! You subscribe to a particular amount of traffic, measured in number of requests, and may spend this traffic for as long as you want. Something like pre-paid cellular tariffs, isn’t it? piFilter tariffs are designed so that any small or average site owner saves up to 10 times by subscribing to piFilter, especially if you count all the direct and indirect expenses of content moderation or legal issues.

It is a unique market offer, so try it and be flexible and safe.

Can adult content become a problem? Part 2.

In Uncategorized on March 27, 2009 at 2:44 pm

Since I published the previous post, I have received some public and private responses and had some conversations with people involved in the web business. Therefore I decided to post another comment to cover some issues.

The problem addressed looks unusual only at first glance. Probably you have not experienced problems with adult content before, or just did not take them into consideration. However, if you consider all the arguments, including the direct and indirect costs of managing the content, legal spending, and the potential risk of lawsuits and the corresponding losses, then the issue becomes more serious.

The case described in my first post relates more to websites. However, I am sure that many ISPs may suffer from customers’ demands to stop porn. Porn can be textual or visual, not necessarily both. In many cases ISPs receive addresses of porn sites from third parties, but is it good enough? Can addresses like facebook.com, myspace.com or msn.com be on the blacklist? Never. Because it is not right to put them on the blacklist. Because they are really good services and many people want them working. And they are really big… Really big, and containing everything. Including what you or your children should not see. So how can an ISP deal with public services that are not on the blacklist? There is only one way out: using independent software for estimating porn in multimedia, because it is a language-independent and pure approach. No text-based filter can protect an ISP subscriber from non-textual resources. BTW, we are ready to provide a complex solution: an image-based filter and a semantic text-based filter in one box. The semantic text filter will not ban Wikipedia over some very artificial case; it will understand the meaning. The image filter will not allow anonymous images with porn to be transferred via your network. So this is a case for ISPs as well.

OK, you may say that porn is one of the biggest businesses in the world. Some (or many?) of your clients want to see adult content. No problem, it is your business, so your software may decide what to do with the estimations obtained from the Recogmission filter. Remember: we do not block the content. Instead, we just generate the ratio of porn potential in each image for your convenience.

Thus, for each estimation by our filter, your software may

  • block porn images
  • send the alerts to the few content managers to deal with it
  • customize the content delivery to particular groups of customers – maybe some of them want to see porn only :)
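
As a rough illustration of the three options above, here is a minimal Python sketch of the decision your software might make. It does not call any real filter API: the verdict strings are the ones quoted in these posts, while the function name, the `Action` type and the policy itself are assumptions made up for this example.

```python
from enum import Enum


class Action(Enum):
    BLOCK = "block"
    ALERT_MODERATORS = "alert moderators"
    DELIVER = "deliver"


# Verdict labels as quoted in these posts; "neutral" and "no porn"
# are treated here as the same answer.
CLEAN_VERDICTS = {"neutral", "no porn"}


def decide(verdict, viewer_wants_adult_content=False):
    """Map a filter verdict to an action. The policy is an illustration only."""
    if verdict in CLEAN_VERDICTS:
        return Action.DELIVER
    if verdict == "suspicious":
        # Let a few content managers look at the borderline cases.
        return Action.ALERT_MODERATORS
    if verdict == "porn":
        # Customize delivery: adults who asked for it may still receive it.
        return Action.DELIVER if viewer_wants_adult_content else Action.BLOCK
    raise ValueError(f"unknown verdict: {verdict!r}")


print(decide("porn"), decide("suspicious"), decide("neutral"))
```

The point of the design is that the filter only estimates, and the policy stays entirely in your own code.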

I believe that each person above 18 has the right either to receive adult content or not to receive it. Your duty to them is to ensure that this right is respected. For your business it means that most people will like it, because it is flexible, efficient and rights-protective at the same time.

Remember: if we cannot see the problem, it does not mean that the problem is absent. It might hurt you if you never assess it. So just ask us about the visual filter service.

Can visual adult content become a problem for your business, and how to manage it?

In Uncategorized on March 26, 2009 at 10:57 am

Adult content, especially images, can be a real problem for any public business like social networks, blogs, online photo albums, ISPs, mobile networks, etc. It could be a problem for corporate processes as well…

Everybody knows about the recent scandals with adult content in public web services.

Of course, the discussion about porn and related things is very old. No doubt, some people love porn, erotica, etc. I think they have the right to love it. However, the big issue is that if you have a public web site, you can have many different users, and some of them do not want to see adult pictures. Some of them are under 18, some are religious, others are parents, and so on. Of course you post legal statements and a service agreement saying that ‘you have to be 18’, ‘porn content is not allowed’, etc. I am sorry – who among web users reads the legal rules carefully before registering in a social network? Maybe some of them…

It would not be a big problem, but consider the following case. A person who just wants to demonstrate something unusual posts adult images to your social network, online photo album or blog. Another person, who is under 18, finds those pictures. Fortunately for his mental health – but unfortunately for your business! – his parents discover this. Maybe they will just tell him that it is no good… Then you are lucky, indeed. However, they can claim in court that you propagate adult content. And let us pray that it is not a child sexual abuse case! Such claims can destroy your business or at least make your life harder.

You say: “OK, I know about all of that. I am moderating my content and I am using a text filter.” My congratulations, because you are clever. However, in most cases user-generated adult content is not described in typical words. Text filters fail to distinguish ‘porn’ from ‘against porn’, so I will not be surprised if this posting is banned by some text filters. Also, consider your expenses for content moderation. If you have an average website with 10,000 image uploads daily, then you have to employ at least 12 content managers to browse all of that. 12 people need to be paid weekly or monthly. You can calculate the costs yourself. You say: “We do not moderate all content items. Just some of them.” You may say: “We ask users to report such cases.” Ask users, of course… Do not moderate all content items (BTW, you do part of the work because you cannot afford to do the whole work!). It is your choice. You can estimate your direct and indirect expenses anyway, and you will stay under permanent stress because of the high probability that your guys missed one damned image with something which is legally improper.
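
To make the moderation workload concrete, here is a small Python sketch of the calculation. The 10,000 uploads and 12 content managers come from the paragraph above; the 8-hour shift and the cost per moderator are placeholders that you should replace with your own numbers.

```python
uploads_per_day = 10_000
moderators = 12
workday_seconds = 8 * 60 * 60   # assumption: an 8-hour shift with no breaks

images_per_moderator = uploads_per_day / moderators
seconds_per_image = workday_seconds / images_per_moderator


def monthly_cost(team_size, cost_per_moderator):
    """Direct monthly cost of the team; cost_per_moderator is your own figure."""
    return team_size * cost_per_moderator


print(f"{images_per_moderator:.0f} images per moderator per day")
print(f"{seconds_per_image:.1f} seconds per image")
print(monthly_cost(moderators, 1_000))  # 1,000 is a placeholder, not a real salary
```

Under these assumptions each moderator has roughly half a minute per image, all day long.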

So here is what I propose: we have developed a visual content filter which can estimate the porn in images. You can connect your service to that engine via our API, send a request for each image and obtain our estimation: ‘porn’, ‘neutral’ or ‘suspicious’. Then your service may decide itself what to do with that image. The filter is based on some of the pattern recognition achievements we developed for Picollator.com . For instance, it can detect nude human bodies and some of their elements in different positions, and it recognizes porn in 75% of all cases. The engine can be hosted on our servers or – if you have some particular demands – can be licensed for deployment in your environment. The solution works for everybody: social network, online photo album, blog, ISP, corporate network, etc. If you want us to do all the work, we can integrate the filter into your service with pleasure. It will cut your costs, improve your brand and let you relax at least a little.

Be sure – it is much (!) cheaper than spending your cash on a whole crowd of moderators, your time on management and your health on lawyers :) You may read the additional info about the filter and contact us.

Financial crisis is here. Picollator is good stuff. Why?

In Uncategorized on October 21, 2008 at 9:02 pm

Everybody is talking about the global financial crisis and how it affects our lifestyle. Many investment issues arise, and marketing reports say that everything will drop. An awful picture, isn’t it? However, there are many reasons why Picollator is a cool project and why it will keep growing even in a crisis.

1. Picollator is about search. Search is forever, and people will search regardless of the crisis.

2. Picollator is for the next generation of search. Next generation means that it follows global trends, which cannot be stopped by financial gaps, as they are driven by civilization. The trends demonstrate that search is not just text. Search is not just images. Information and search cover everything, text, multimedia, etc. = content.

3. Picollator is the future for mobile users. With the next releases of picollator.mobi and its apps, they will not have to use the keyboard much. The search can be performed contextually, by any viewed or stored content, using one button. Mobile users will support the technology market during the next years.

4. Picollator is a really high-tech development. It will not bring profit tomorrow, because it should not – it is a financial crisis, or did you forget? Very well, it will bring profit in 3 years, when users are feeling better and B-customers compete for those users. So just do the development during the bad times, and be on top in the good times.

5. Picollator delivers fun and provides entertainment, because it works with multimedia directly. People love fun, and they love fun even more if they do not have cash for other things.

So be with us, and you will not miss things.

Picollator mobile came to your device

In Uncategorized on September 22, 2008 at 8:35 pm

We have finished some tests with the new version of Picollator for mobile search. In the past it worked on the main website, but we have moved it to the fashionable http://picollator.mobi domain. Now it works better on most mobile devices, and you can try it directly from your cell phone or communicator. We have made some changes to the interface to simplify it, and now it works on most Nokia phones and Windows Mobile communicators. Although Picollator.mobi is pretty good in many standard browsers, we recommend using the latest Opera Mini browser, as it works fast and delivers a pleasant mobile experience. Please feel free to use our contact form to report bugs.

Mixed data search results are correct, or A ball is not a lady

In Uncategorized on July 29, 2008 at 8:25 pm

At some blogging website, we have found an example of an ‘error’ in recognizing pictures at Picollator. The author submitted an image of a ball together with the text query ‘ball’. Then he reported that Picollator recognized the ball as a lady’s face. Look at the screen shot below:

No similar images found there. Just text search works.

If you look closely at the screen shot, you can see that:

  1. The tomato sign in the upper right corner of the input image of the ball is red. It means that the user request contains that image as the sample.
  2. The text ‘ball’ is present in the query string.

The image of the ball did not return any results, because the recognition algorithms distinguish faces well enough. I should say that results can be found by image OR by text query. In our case Picollator found portraits of that lady because her name contains ‘Ball’, i.e., by the text query.

This is like when you submit a long textual query to a conventional search engine and it returns results matching just a few of the words, as not all words are found. Here, Picollator did the same. It did not find similar images, but retrieved images by the user’s textual input.
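
The ‘image OR text’ behaviour can be pictured with a toy Python sketch. It is not Picollator’s code: the function and the file name are invented, and the two input lists stand in for whatever the image matcher and the text index return.

```python
def mixed_search(image_matches, text_matches):
    """Combine results found by image OR by text, remembering which data type found each."""
    results = {}
    for resource in image_matches:
        results.setdefault(resource, set()).add("image")
    for resource in text_matches:
        results.setdefault(resource, set()).add("text")
    return results


# The ball case: the image part finds nothing, the text part finds a portrait
# whose description contains the word 'Ball' (hypothetical file name).
found = mixed_search(image_matches=[], text_matches=["portrait_of_ms_ball.jpg"])
print(found)
```

A resource found both ways would carry both labels, which is exactly the information the interface has to present for a mixed request.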

How can you know which web resource was found by which data type? Of course, we are working on this interface issue, as it is not easy to present the results of a mixed request. At the moment, you may use one of the following ways:

  1. Move the mouse cursor over the resource (image) and look at the pop-up. It contains either one found image, or the submitted image and the found one together. If there is only one picture, then it was found by the text, which is displayed below it. In the case of two images, the first one was found by similarity with the submitted sample.
  2. Move the mouse over the input image. If you place the mouse in the facial area, you can see all similar images in the list of results, if some of them were found by visual similarity. The same functionality works if you move the mouse over the face in the input image in the previous case.
  3. You may use the tomato sign to switch the image-based search on and off.

For more info, read my previous posts:

http://recogmission.wordpress.com/2008/04/17/identified-faces-are-highlighted-now/

http://recogmission.wordpress.com/2008/07/07/text-web-search-in-picollator/

Sorry, I have noticed that I did not say anything about this very new image+text interface with tomato before… Please give me more time.

Text search obtains semantics

In Uncategorized on July 9, 2008 at 3:13 pm

As I stated in my previous posting, we have implemented universal search at www.picollator.com . Using text, images or both together, you can find web pages and images. Now I am writing a note about the intriguing ability of Picollator to find data using text much better than others. As I wrote, Picollator now understands the correspondence between visual objects and the text submitted and indexed. Let me publish a couple of examples.

1. I put a simple text request with the word ‘barbara’, which is just a female name, nothing else. With Picollator, I obtained the following results:

text search using the word barbara at picollator.com

Compare them to the ones from a famous conventional text search system:

text search using the word barbara at other web search system

Take a closer look. The first one contains results ordered not just by the word ‘barbara’, but also by the similarity between faces inside the images. The last one has just an arbitrary or unknown ranking, while Picollator delivers the most valuable images first.

2. I put a simple text request with the word ‘sam’, which is just a male name. With Picollator, I obtained the following results:

text search using the word sam at picollator.com

Again, you can see the tendency: similar faces are displayed first and in groups, in contrast to the disordered results of the corresponding search from the other search engine:

text search using the word sam at other search engine

It is not because we collect similar images and faces one by one – that is practically unrealistic at the scale of the WWW. Picollator includes technology which allows it to compare visual objects with words, i.e., it uses semantics. It does so without any formal algebra approach, just because Picollator is able to visually compare pictures and objects, as well as collect and index text data.

Of course, it is the very first release of that tech, but it will grow, the database will become bigger, and you will see a web search engine with real A.I. Maybe it is alive already. Sometimes I feel as if it were alive.

Text web search in Picollator

In Uncategorized on July 7, 2008 at 10:16 pm

We have made a new release of www.picollator.com with advanced text search. As you have seen, we used tags as the key text to find content. However, the new generation allows very simple text search using the input string, as all web search engines do. Of course, it is not so powerful yet. However, it is a little bit more than just text-based search. First of all, you can search using text input (as you are probably used to doing in Google, Yahoo, MSN, Ask, etc.). Second, you can search using visual input, as Picollator allowed initially. Third, you can search using an image and text as the query at the same time, and this is the thing Picollator offers as the technology pioneer again. The ranking is performed across all resources found, depending on image-based ratios and text-based ratios together. There is one small thing about the text search I have to tell you about. This text search is not just a regular text search returning ranked text index results. It depends on the visual similarity of some of the data processed, even if you use only text to search. Picollator is able to – please pay attention to what I am saying! – Picollator is able to understand the visual meaning of some words. It does not mean that it just counts the words attached to pictures or webpages. Its engine also uses some estimation of how the text fits the visual similarity of known objects.
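
One simple way to picture a ranking that depends on ‘image-based ratios and text-based ratios together’ is a weighted average of the two. The Python sketch below is only that picture: the post does not say how Picollator really combines the ratios, and the weight, the names and the sample numbers are all assumptions.

```python
def combined_score(image_ratio, text_ratio, image_weight=0.5):
    """Weighted average of two similarity ratios, each expected in the range 0..1."""
    return image_weight * image_ratio + (1.0 - image_weight) * text_ratio


def rank(resources, image_weight=0.5):
    """Order resource names by combined score, best first.

    resources maps a name to a pair (image_ratio, text_ratio).
    """
    return sorted(
        resources,
        key=lambda name: combined_score(*resources[name], image_weight=image_weight),
        reverse=True,
    )


sample = {
    "a": (0.9, 0.2),  # looks very similar, text barely matches
    "b": (0.4, 0.8),  # text matches well, moderate visual similarity
    "c": (0.0, 0.5),  # found by text only
}
print(rank(sample))
```

Changing `image_weight` shifts the balance: with a weight of 1.0 only the visual similarity counts, and with 0.0 the ranking becomes purely textual.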

In my next posting I will publish some advice on using the new release.

Picollator web search recognized a drawing – watch the video at Youtube

In Uncategorized on June 8, 2008 at 3:14 pm

As I described in one of my recent posts, we have found that in practice we have achieved results comparable with human intelligence in terms of generalization in pattern recognition. Picollator is able to find a photo by an artwork. You may watch a video on Youtube containing a true example of how Picollator found a real photo of a young lady using a drawing being made in real time: http://www.youtube.com/watch?v=ZfVgH56TRkI

New search concept at Picollator: find web pages by image or text

In Uncategorized on May 5, 2008 at 9:33 pm

As I promised some days ago, we are opening something new, which certainly differs from any other service. Now you are able to find a web resource by an image and/or some text. Before, you were able to find relevant pictures by images; now you can find web pages by images as well, and have the results ranked.

Have you seen another engine which is able to work with artworks?

In Uncategorized on April 17, 2008 at 5:45 pm

By the way, we found (thanks to site visitors!) an interesting example. www.picollator.com is able to find similar images by the artwork. Look at the picture below:

Real photo found by the artwork

Identified faces are highlighted now

In Uncategorized on April 17, 2008 at 5:26 pm

As I promised, we have installed the new interface of http://www.picollator.com with the option to see who in the results resembles your search query.

1. Point at any picture in the results.

2. You will see the pop-up window.

3. Move the mouse over the area of any identified face inside your picture. All corresponding faces in the image from the results will be highlighted by a white square. See the example below:

Left face is similar to small face from the complex picture on the right.

News about what to expect from Picollator in the coming weeks

In Uncategorized on April 10, 2008 at 11:28 pm

Sorry, I have not posted anything for too long. I suppose many of you have questions after writing and reading other posts about http://www.picollator.com . Many people discuss whether it works well or not. I found that many of the questions are already covered in my past messages.

For instance, one typical mistake is to compare Picollator.com with celebrity matching services. I acknowledge them, but please do not mix them up: Picollator is an honest search engine which builds its index from everything taken automatically from the web. Our people do not do any manual work with it except monitoring, changing algorithms and managing resources. If you see celebrities in the results, it just means that famous people are present on the web more often than others. Therefore they get into our index with higher probability, because the index is not very big yet. Celebrity matching websites work on a prepared database consisting of celebrity photos only. It is a big difference, isn’t it? The difference is like the one between search machines and directories at the Zero stage of web search.

New release of Picollator.com

In Uncategorized on March 11, 2008 at 9:28 pm

We have installed the new version of www.picollator.com, now with faster algorithms and with keywords to help you navigate.

Advantages of multimedia search vs text search: you do not need to speak the language

In Uncategorized on March 4, 2008 at 8:13 pm

Let us imagine you want to find something about … Aishwarya Rai and her movies. Sorry for mentioning that pretty actress as an example, but there always has to be an example. OK, back to the issue.

Aishwarya Rai. If you are a fluent English speaker, you spell it ‘Aishwarya Rai’ in English. However, you might also type ‘Ashwarya Rai’ (see http://www.ashwaryarai.org/) or ‘Aishvarya Rai’, and the results of a web search will depend on how you spell the name. Now imagine that you could miss something by typing that name incorrectly. Now even more – imagine that you do not speak fluent English. Just do not speak English at all. By the way, if you did not know – more than 4 billion people (a lot of people, isn’t it?) do not speak a word of English. OK, so you do not speak English, but you want some info about Aishwarya Rai, and in particular images or movies.

You know what? You cannot find it. All search engines track text, but in your case text does not work without language knowledge. Forget about Ms Rai.

However, if you had an opportunity to find her by your beloved photo taken from a Chinese magazine, things might change. You submit the photo (without a word of English), and the search engine finds all resources containing other photos or videos featuring Ms Rai. Pretty simple, isn’t it?

I think you will have this opportunity very soon, as we implement the new generation of our system Picollator and enhance the infrastructure.

Internet and www.picollator.com engine capacity

In Uncategorized on February 21, 2008 at 4:04 pm

Very often (yet!) we get comments like ‘I uploaded my photo but picollator did not find me at abcdefgh.com. I know that my photo is there, why did it fail?’

Gender recognition

In Uncategorized on February 12, 2008 at 11:23 pm

One of the key issues in all pattern recognition applications is gender recognition. Let us consider a case: you are using a brand-new pattern recognition and image processing application. You are male (or female, it does not matter for this discussion, so you may take either side). You submit your photo, and the system says you look like several pretty girls, together with your old photos and some of your male relatives. Very often you say: wow, it is not like in real life! The system gave a wrong answer, it did not really recognize me, etc. Are you right?

This is not a biometric technology

In Uncategorized on February 12, 2008 at 11:21 pm

I would like to make some comments to avoid the typical mistakes occurring when I discuss the issues of image-based search with other people. The most frequent question is about the rate of correct face identification. Dear friends, please forget about face identification if you are talking about picollator.com . Mention it in airports, police and access control. We are not dealing with special security applications.

About this blog

In Uncategorized on February 12, 2008 at 11:18 pm

Sorry for the delay in starting this blog. I am going to respond asap to any comments you may post here, because I am strongly interested in many opinions on how to offer the best technology for multimedia search on the Internet.

Just a brief note (you may skip it to avoid wasting time on such details): I have started a company, www.recogmission.com, to make multimedia search possible without typing letters. The company is not too big at the moment (fewer than 30 highly experienced gentlemen and ladies), but it is very active and I hope it will succeed.

The core question we are attempting to answer is a pretty well-known issue: it is better to see something once than to listen to its description 100 times. Internet search fully depends on text at the moment. In order to find something, you have to describe it in words. Search engines take your words and just calculate how many times they can be found in some resources, and what the popularity ratio of those resources is. I bet you know about the difficulties of finding information. Too many resources with words and without sense, too complicated a process of composing the correct query, too much time to understand the content from the long texts on websites, etc. Another problem is how to find non-text content. Contemporary search engines give you the answer: please provide us with a text description and we will find pictures (video, music) described by your words. Again. Textual description. Can you really provide a description of the picture below to identify it on the net?
Sample picture which cannot be described in a few words

The problem is that text is just a formal representation of the world. In real life you think in images and patterns, continuously.
We are developing a system which recognizes multimedia content by dividing it into a set of objects. At the moment, we process static images at http://www.picollator.com/ . Picollator finds pictures by the human faces found in them, because the face is one of the most popular objects in multimedia. After all, 90% of what is around us is about people, isn’t it? :) We made this quite complex pattern recognition and indexing system in order to demonstrate that words are not enough and that pictures speak for themselves.
So, please provide your comments on this topic if you found it interesting.