On Demand Property Serialization in Asp.Net Core MVC

This article describes a simple approach to selectively control which properties of a class are serialized to the output and which are ignored in an ASP.NET Core MVC application (2.2 at the time of writing). The method requires no changes to the serialized class. The caller specifies the fields to return as a comma-separated list in the query string; if none are specified, all fields of the class are returned. The selection is applied in an ActionFilter that changes the behaviour of the JSON serializer (Newtonsoft.Json) so that it ignores fields not named in the query string. This happens relatively late in the overall pipeline.
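The field-selection idea can be illustrated independently of ASP.NET Core. The sketch below is written in Python rather than C#, and it is not the article's ActionFilter: the query string parameter name `fields`, the function name `select_fields` and the case-insensitive matching are all assumptions made for illustration.

```python
from urllib.parse import parse_qs


def select_fields(record: dict, query_string: str, param: str = "fields") -> dict:
    """Return only the keys of `record` that the caller asked for.

    `param` is a hypothetical query string parameter holding a
    comma-separated list of field names. If it is missing or empty,
    every field is returned.
    """
    values = parse_qs(query_string).get(param, [])
    wanted = {
        name.strip().lower()
        for value in values
        for name in value.split(",")
        if name.strip()
    }
    if not wanted:
        # No selection given: return all fields, as the article describes.
        return dict(record)
    # Assumption: field names are matched case-insensitively.
    return {key: val for key, val in record.items() if key.lower() in wanted}


user = {"Id": 1, "Name": "Ada", "Email": "ada@example.com"}
partial = select_fields(user, "fields=id,name")
everything = select_fields(user, "")
```

Here `partial` keeps only `Id` and `Name`, while `everything` is a copy of the whole record because no fields were requested.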

Read More

Tuning Lucene for Best Results

Our search engine was recently upgraded to Lucene.Net 4.8 (from 3.3), which presented the right window of opportunity for some fine tuning and refactoring. Careful planning is often required to fit the indexing and querying strategies to specific scenarios, assumptions and content (bilingual English and French in our case). Below are a few approaches that worked for us in an ASP.Net RESTful search API that wraps Lucene.Net.

Read More

Lucene.Net's Core Indexing and Search Classes

Lucene exposes a few classes that abstract a rich set of functionality behind a fairly straightforward interface for implementing indexing and search operations. Understanding the role each of these classes plays is key to leveraging and extending Lucene effectively. Usually about five or six indexing and search classes are involved.

Read More

Lucene Analysis Process

Lucene uses analyzers to break down text and extract searchable units known as terms. These terms are the basic building blocks of an index and are used to identify the documents that match queries during search. An analyzer usually consists of a series of tokenizer, stemmer and filter classes, which may be chained into a pipeline so that the output from one becomes the input for the next. Tokenizers break down data into smaller chunks known as tokens. Stemmers reduce a word to its base form, which depends on the language used. Filters examine the token stream and decide what to keep, transform and discard. Lucene contains several built-in analyzers which act differently on any given text and generate distinctive output. It's also easy to create custom analyzers if the built-in ones do not meet the requirements of an application.
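The chaining described above can be sketched in a few lines of plain Python. This is not Lucene's API: the function names, the tiny stop word list and the suffix-stripping "stemmer" are simplified stand-ins chosen for illustration, and a real stemmer is considerably more careful.

```python
import re
from typing import Callable, Iterable, Iterator, List, Sequence

# Hypothetical, deliberately tiny stop word list.
STOP_WORDS = {"the", "a", "an", "and", "of", "to"}

TokenFilter = Callable[[Iterable[str]], Iterator[str]]


def tokenize(text: str) -> Iterator[str]:
    """Tokenizer: split raw text into word tokens."""
    return iter(re.findall(r"\w+", text))


def lowercase(tokens: Iterable[str]) -> Iterator[str]:
    """Filter: transform every token to lower case."""
    return (t.lower() for t in tokens)


def remove_stop_words(tokens: Iterable[str]) -> Iterator[str]:
    """Filter: discard tokens that carry little search value."""
    return (t for t in tokens if t not in STOP_WORDS)


def naive_stem(tokens: Iterable[str]) -> Iterator[str]:
    """Stemmer stand-in: strip one common English suffix per token."""
    for t in tokens:
        for suffix in ("ing", "es", "s"):
            if t.endswith(suffix) and len(t) - len(suffix) >= 3:
                t = t[: -len(suffix)]
                break
        yield t


def analyze(
    text: str,
    filters: Sequence[TokenFilter] = (lowercase, remove_stop_words, naive_stem),
) -> List[str]:
    """Run the tokenizer, then feed each stage's output into the next."""
    stream: Iterable[str] = tokenize(text)
    for token_filter in filters:
        stream = token_filter(stream)
    return list(stream)


terms = analyze("The analyzers are indexing the documents")
# terms == ["analyzer", "are", "index", "document"]
```

Swapping, removing or reordering entries in `filters` changes the terms produced, which mirrors how different analyzers generate different output from the same text.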

Read More

The Basics of Information Retrieval using Lucene

Apache Lucene is an open source information retrieval software library that makes it relatively easy to add search functionality to any application or website. It was originally written in Java but has since been ported to several other programming languages including C#, C++ and PHP. It works mainly on textual input, which it treats as documents containing fields of text, making it independent of any file format. Lucene provides several search algorithms and queries that can be customised to address complex search problems. This article gives an overview of information retrieval using an inverted index.
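As a rough illustration of what an inverted index is, the sketch below maps each term to the set of document ids that contain it and answers a query by intersecting those sets. It is a minimal sketch, not Lucene's implementation: the names `build_index` and `search` are made up, terms are produced by a plain whitespace split, and there is no scoring or ranking.

```python
from collections import defaultdict
from typing import Dict, Set


def build_index(docs: Dict[int, str]) -> Dict[str, Set[int]]:
    """Map every term to the ids of the documents that contain it."""
    index: Dict[str, Set[int]] = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def search(index: Dict[str, Set[int]], query: str) -> Set[int]:
    """Return ids of documents containing every term in the query."""
    terms = query.lower().split()
    if not terms:
        return set()
    postings = [index.get(term, set()) for term in terms]
    return set.intersection(*postings)


docs = {
    1: "lucene search library",
    2: "search engine tuning",
    3: "lucene analyzers",
}
index = build_index(docs)
both = search(index, "lucene search")   # {1}
either_doc = search(index, "search")    # {1, 2}
```

The lookup cost depends on the number of query terms and the size of their posting sets, not on the total amount of text, which is why this structure suits search.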

Read More

Using Python and OpenCV for Face Detection From a Live Camera

OpenCV is an open source software library consisting of a comprehensive set of optimised computer vision and machine learning algorithms that can be used to enhance machine perception of the physical world, with applications such as face recognition, object identification, human action classification, object movement tracking, image stitching, red eye removal and much more. This fun experiment attempts to identify the faces of people in a live video stream. If the program is able to recognise a person, it displays their name and a number for the confidence level (lower is better). Otherwise, it classifies them as an "intruder".

Read More

Implementing Google reCaptcha V2 in ASP.Net Core using Model Binding and Ajax

Google's reCaptcha is an effective tool for protecting websites against spam bots. In most cases, valid users only have to click a checkbox to get through. Using an advanced risk analysis mechanism, it presents challenges when the risk level is deemed high enough, which discourages bots from engaging further with the website. This article describes the implementation of reCaptcha v2.0 in Asp.Net Core using model binding and Ajax calls to retrieve and validate the user's response.

Read More

Removing Text Qualifiers from Numbers in Delimited Files or Strings

Importing delimited data as a bulk operation in SQL Server often requires the use of a format file, especially when converting numeric strings into numeric data types fails because of text qualifiers such as double quotes. Creating a format file can be a tedious task if the number of columns is high. The approach described in this article removes the text qualifiers from numbers in delimited data and works well if the numeric cells have a non-empty value.
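To show the kind of transformation involved, here is a minimal Python sketch that strips qualifiers from cells holding only a number and leaves every other cell alone. It is an illustration, not the article's method: the function name is hypothetical, the delimiter and qualifier are assumed to be single characters, and only plain integers and decimals are treated as numbers.

```python
import re


def strip_numeric_qualifiers(line: str, delimiter: str = ",", qualifier: str = '"') -> str:
    """Remove text qualifiers around cells that contain only a number.

    Assumes single-character delimiter and qualifier. A quoted number is
    unwrapped only when it fills the whole cell, i.e. it sits between
    delimiters or at the start or end of the line.
    """
    d = re.escape(delimiter)
    q = re.escape(qualifier)
    pattern = re.compile(
        rf"(?<![^{d}])"            # start of line or just after a delimiter
        rf"{q}(-?\d+(?:\.\d+)?){q}"  # qualified integer or decimal
        rf"(?![^{d}\r\n])"         # delimiter, line break or end of line
    )
    return pattern.sub(r"\1", line)


row = '"Smith, J","42","3.14","A1",""'
cleaned = strip_numeric_qualifiers(row)
# cleaned == '"Smith, J",42,3.14,"A1",""'
```

Text cells keep their qualifiers, so an embedded delimiter such as the comma in `"Smith, J"` stays protected. Empty qualified cells are left untouched, which is consistent with the non-empty requirement noted above.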

Read More

A QR Code Api Based on a Custom ASP.Net Core Middleware

These are exciting times to be associated with .Net. The introduction of .Net Core is a major rethink of the platform to make it leaner, more portable and more flexible. The new modular pipeline, made up of selected middleware components, can be hosted on IIS, in its own process or on any OWIN based server. This article describes a bare-bones approach to creating an API that generates QR codes using a custom middleware, without relying on ASP.NET MVC.

Read More

Generating XML Sitemaps from SQL Server

XML sitemaps, Atom and RSS are three popular XML-based formats used to expose website content to interested parties such as crawlers. This article describes the use of T-SQL and the XML functionality in SQL Server to generate XML sitemaps directly from a SQL Server database. The same principles can be applied to generate other XML formats such as Atom and RSS.

Read More