CAPTCHAs are widely used on the Internet as partial protection against computer attacks, and text-based CAPTCHAs are commonly used. To make attacks on them more flexible, this paper presents a configurable framework based on k-NN that consists of three major parts: preprocessing the binary image, building a standard library, and recognizing the image. The standard library is built from the training dataset, where the third part can optionally drop characters with high similarity, and the library is then used to classify the testing dataset. A bit-based similarity model is proposed in which bitwise "and" and "or" operations are executed and the similarity is the ratio of their results. Finally, the framework is applied to four typical scenarios: the MNIST handwriting database, CAPTCHAs built by a CAPTCHA generator, online CAPTCHAs from the CNKI website, and CAPTCHAs from the open-source PHP-based DedeCMS; the average classification accuracy is 97.05%. The model is simple but effective, and the framework works well for text-based CAPTCHAs and handwritten digits, which may prompt the associated websites to pay more attention to their current authentication mechanisms. It also offers the flexibility to cover more algorithms and application scenarios by implementing different preprocessing logic against the defined APIs.
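The bit-based similarity can be illustrated with a minimal Python sketch. It rests on assumptions the abstract does not confirm: that the ratio is the number of set bits in the "and" result divided by the number of set bits in the "or" result, and that classification takes the single nearest template (k = 1). The names `to_bits`, `bit_similarity`, and `classify`, and the 3x3 toy templates, are hypothetical and are not taken from the paper.

```python
def to_bits(rows):
    """Pack a binary image, given as a list of '0'/'1' strings, into one integer."""
    return int("".join(rows), 2)


def bit_similarity(a, b):
    """Assumed ratio: set bits in (a AND b) divided by set bits in (a OR b)."""
    union = bin(a | b).count("1")
    if union == 0:
        return 1.0  # assumption: two empty images count as identical
    return bin(a & b).count("1") / union


def classify(sample, library):
    """Return the label of the most similar template (1-NN, an assumption)."""
    best_label, _ = max(library, key=lambda item: bit_similarity(sample, item[1]))
    return best_label


# Hypothetical standard library: one 3x3 template per character.
library = [
    ("1", to_bits(["010", "010", "010"])),
    ("7", to_bits(["111", "001", "001"])),
]

# A noisy "1" with one extra pixel in the bottom-right corner.
sample = to_bits(["010", "010", "011"])
print(classify(sample, library))
```

Packing each preprocessed image into a single integer keeps the comparison down to two bitwise operations and two bit counts, which fits the abstract's description of the model as simple.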