YARA syntax

yara

Contents

    • yara
  • 1. Strings
    • 1.1 Hexadecimal strings
      • wild-cards
      • jump
      • alternatives
    • 1.2 text strings
      • xor
      • full words
    • 1.3 regular expressions
  • 2. condition
    • 2.1 Counting strings
    • 2.2 String offsets or virtual addresses
    • 2.3 Match length
    • 2.4 File size
    • 2.5 Executable entry point
    • 2.6 Accessing data at a given position
    • 2.7 Sets of strings
    • 2.8 Applying the same condition to many strings
    • 2.9 Using anonymous strings with of and for..of
    • 2.10 Iterating over string occurrences
    • 2.11 Referencing other rules
  • 3. More about rules
    • global rules
    • private rules
    • tags
    • metadata

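  • Written in C and cross-platform;
  • Ships a Python extension, so the scanning engine can be driven from Python scripts;
  • Rule syntax resembles C code (comments, for example), but string identifiers start with a dollar sign;
  • A rule consists of a set of string definitions and a boolean expression (the condition);
  • The engine can also scan running processes.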

1. Strings

There are three types of strings in YARA:

  • hexadecimal strings
  • text strings
  • regular expressions

1.1 Hexadecimal strings

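Hexadecimal strings support three special constructions that make them more flexible:

  • wild-cards: placeholders for unknown bytes or nibbles;
  • jumps: gaps of variable length;
  • alternatives: a choice between several byte sequences.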

wild-cards

rule WildcardExample
{
    strings:
       $hex_string = { E2 34 ?? C8 A? FB }

    condition:
       $hex_string
}

jump

Use a jump when a string contains a chunk whose content and length are both variable.

rule JumpExample
{
        strings:
           $hex_string = { F4 23 [4-6] 62 B4 }

        condition:
           $hex_string
}

Any arbitrary sequence of 4 to 6 bytes can occupy the position of the jump.

Any jump [X-Y] must meet the condition 0 <= X <= Y.

The following forms are equivalent:

FE 39 45 [6] 89 00
FE 39 45 [6-6] 89 00
FE 39 45 ?? ?? ?? ?? ?? ?? 89 00

Since YARA 2.0, jumps may also be unbounded:

FE 39 45 [10-] 89 00
FE 39 45 [-] 89 00  //[0-infinite]

alternatives

Alternatives let you offer several options for a given fragment of a hexadecimal string:

rule AlternativesExample1
{
    strings:
       $hex_string = { F4 23 ( 62 B4 | 56 ) 45 }

    condition:
       $hex_string
}

This rule matches either F42362B445 or F4235645.

More than two alternatives are allowed, and they may contain wild-cards: $hex_string = { F4 23 ( 62 B4 | 56 | 45 ?? 67 ) 45 }
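
The three constructions above map quite directly onto regular-expression syntax. The following is a minimal Python sketch, not YARA's implementation: hex_to_regex is a hypothetical helper that translates the body of a hex string into a bytes regex, assuming tokens are written as in the examples above (jumps without spaces inside the brackets).

```python
import re

def hex_to_regex(body):
    """Translate the body of a YARA-style hex string into a compiled bytes regex."""
    spaced = body
    for ch in "(|)":
        spaced = spaced.replace(ch, f" {ch} ")   # make sure ( | ) are separate tokens
    out = []
    for tok in spaced.split():
        if tok == "(":
            out.append("(?:")                    # alternatives become a non-capturing group
        elif tok in ("|", ")"):
            out.append(tok)
        elif tok.startswith("["):                # jump: [n], [n-m], [n-] or [-]
            lo, dash, hi = tok[1:-1].partition("-")
            out.append(".{%s,%s}" % (lo or "0", hi) if dash else ".{%s}" % lo)
        elif tok == "??":                        # any byte
            out.append(".")
        elif tok[0] == "?":                      # low nibble fixed, e.g. ?B
            out.append("[" + "".join("\\x%x%s" % (h, tok[1]) for h in range(16)) + "]")
        elif tok[1] == "?":                      # high nibble fixed, e.g. A?
            out.append("[\\x%s0-\\x%sf]" % (tok[0], tok[0]))
        else:                                    # plain byte, e.g. E2
            out.append("\\x" + tok)
    # DOTALL so that "." also matches the newline byte 0x0A
    return re.compile("".join(out).encode("ascii"), re.DOTALL)

wildcard = hex_to_regex("E2 34 ?? C8 A? FB")
jump = hex_to_regex("F4 23 [4-6] 62 B4")
alternatives = hex_to_regex("F4 23 ( 62 B4 | 56 ) 45")

print(bool(wildcard.search(bytes.fromhex("00E23499C8A7FB"))))
print(bool(jump.search(bytes.fromhex("F4230102030462B4"))))
print(bool(alternatives.fullmatch(bytes.fromhex("F4235645"))))
```

The sketch only answers whether a pattern occurs; it makes no attempt to reproduce how YARA reports individual matches.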

1.2 text strings

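ASCII-encoded and case-sensitive by default: $text_string = "foobar"

Escape sequences are also supported, for example \xdd for any byte in hexadecimal notation.

Case-insensitive: $text_string = "foobar" nocase

The wide modifier searches for strings encoded with two bytes per character, so "Borland" is matched as B\x00o\x00r\x00l\x00a\x00n\x00d\x00:

$wide_string = "Borland" wide

Note that wide only interleaves zero bytes with the ASCII codes of the characters. It does not support true UTF-16 strings containing non-English characters such as Chinese. To search for a string in both its ASCII and wide forms, combine wide with the ascii modifier:

$wide_and_ascii_string = "Borland" wide ascii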

xor

$xor_string = "This program cannot" xor

This string definition searches for every single-byte XOR applied to the string "This program cannot".

It is equivalent to:

$xor_string_00 = "This program cannot"
$xor_string_01 = "Uihr!qsnfs`l!b`oonu"
$xor_string_02 = "Vjkq\"rpmepco\"acllmv"
// Repeat for every single byte xor

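The xor modifier is applied after every other modifier, so when it is combined with wide the interleaved zero bytes are XORed as well. For example:

$xor_string = "This program cannot" xor wide

is equivalent to:
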
$xor_string_00 = "T\x00h\x00i\x00s\x00 \x00p\x00r\x00o\x00g\x00r\x00a\x00m\x00 \x00c\x00a\x00n\x00n\x00o\x00t\x00"
$xor_string_01 = "U\x01i\x01h\x01r\x01!\x01q\x01s\x01n\x01f\x01s\x01`\x01l\x01!\x01b\x01`\x01o\x01o\x01n\x01u\x01"
$xor_string_02 = "V\x02j\x02k\x02q\x02\"\x02r\x02p\x02m\x02e\x02p\x02c\x02o\x02\"\x02a\x02c\x02l\x02l\x02m\x02v\x02"
// Repeat for every single byte xor operation.

The range of XOR keys can be restricted: xor(0x01-0xff)
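
The expansions listed above can be reproduced in a few lines of Python. This is only a sketch of what the modifiers mean, not of how YARA searches; wide and xor_variants are hypothetical helper names.

```python
def wide(data):
    """Interleave every byte with a zero byte, as the wide modifier does."""
    return b"".join(bytes([b, 0]) for b in data)

def xor_variants(data, lo=0x00, hi=0xFF):
    """Map each single-byte key in [lo, hi] to the XORed form of data."""
    return {key: bytes(b ^ key for b in data) for key in range(lo, hi + 1)}

plain = b"This program cannot"

variants = xor_variants(plain)
print(variants[0x01])            # b'Uihr!qsnfs`l!b`oonu'

# xor is applied last, so with wide the zero bytes are XORed too.
wide_variants = xor_variants(wide(plain))
print(wide_variants[0x01][:4])   # b'U\x01i\x01'

# A restricted key range such as xor(0x01-0xff) simply skips the excluded keys.
restricted = xor_variants(plain, 0x01, 0xFF)
print(len(restricted))           # 255
```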

full words

The fullword modifier guarantees that the string matches only if it appears in the file delimited by non-alphanumeric characters.

For example the string domain, if defined as fullword, doesn’t match www.mydomain.com but it matches www.my-domain.com and www.domain.com.
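
The domain example can be checked with a small Python sketch. It models fullword as "no ASCII letter or digit immediately before or after the match", which is an assumption about the delimiter rule taken from the sentence above; fullword_search is a hypothetical helper.

```python
import re

def fullword_search(needle, haystack):
    """Return True if needle occurs delimited by non-alphanumeric bytes."""
    pattern = rb"(?<![0-9A-Za-z])" + re.escape(needle) + rb"(?![0-9A-Za-z])"
    return re.search(pattern, haystack) is not None

for host in (b"www.mydomain.com", b"www.my-domain.com", b"www.domain.com"):
    print(host, fullword_search(b"domain", host))
```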

1.3 regular expressions

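Regular expressions are defined in the same way as text strings, but are enclosed in forward slashes instead of double quotes, as in Perl.

They can also be followed by the nocase, ascii, wide, and fullword modifiers.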

$re1 = /md5: [0-9a-zA-Z]{32}/ 
$re2 = /state: (on|off)/ 

2. condition

A condition is simply a boolean expression. The boolean operators are spelled as in Python: and, or, and not.

2.1 Counting strings

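Sometimes you need to know how many times a string appears in the file or in process memory. The count of a string is written by replacing the $ in its identifier with #.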

rule CountExample
{
    strings:
        $a = "dummy1"
        $b = "dummy2"

    condition:
        #a == 6 and #b > 10
}

2.2 String offsets or virtual addresses

Sometimes we need to know whether a string is at a specific offset in the file or at a specific virtual address within the process address space. In such situations the at operator is what we need.

rule AtExample
{
    strings:
        $a = "dummy1"
        $b = "dummy2"

    condition:
        $a at 100 and $b at 200
}

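The expression $a at 100 is true only if string $a is found at offset 100 within the file (or at virtual address 100 if applied to a running process).

The values 100 and 200 are decimal; prefix a number with 0x to write it in hexadecimal. The at operator has higher precedence than and.

You can also search within a range of offsets: $a in (0..100) and $b in (100..filesize)

You can also use @a[i] to get the offset or virtual address of the i-th occurrence of string $a. The index is one-based, so the first occurrence is @a[1], the second is @a[2], and so on. If the index exceeds the number of occurrences, the result is NaN (not a number).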

2.3 Match length

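For regular expressions and for hex strings containing jumps, the length of a match can vary. The regular expression /fo*/, for example, can match "fo", "foo" and "fooo". Use !a[i] to get the length of the i-th match of $a; !a is shorthand for !a[1].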
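
The operators #a, @a[i], !a[i], at and in can all be read as queries over the list of matches. The Python sketch below models them with the re module. The helper names are hypothetical, None stands in for the undefined value, and re.finditer only reports non-overlapping matches, so this approximates the semantics rather than reproducing YARA's matcher.

```python
import re

def occurrences(pattern, data):
    """Return an (offset, length) pair for each match of a bytes regex in data."""
    return [(m.start(), len(m.group())) for m in re.finditer(pattern, data)]

def count(occ):                 # models #a
    return len(occ)

def offset(occ, i):             # models @a[i], one-based
    return occ[i - 1][0] if 1 <= i <= len(occ) else None

def length(occ, i=1):           # models !a[i]; !a means !a[1]
    return occ[i - 1][1] if 1 <= i <= len(occ) else None

def at(occ, where):             # models "$a at where"
    return any(off == where for off, _ in occ)

def within(occ, lo, hi):        # models "$a in (lo..hi)"
    return any(lo <= off <= hi for off, _ in occ)

data = b"fo..foo..fooo"
a = occurrences(rb"fo*", data)

print(count(a))                       # 3
print(offset(a, 2), length(a, 2))     # 4 3
print(offset(a, 4))                   # None
print(at(a, 9), within(a, 5, 8))      # True False
```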

2.4 File size

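filesize is a special variable that holds the size, in bytes, of the file being scanned. Integer constants may carry a KB or MB postfix, which multiplies the value by 1024 or 2^20 respectively.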

rule FileSizeExample
{
    condition:
       filesize > 200KB
}

2.5 Executable entry point

If the file is a Portable Executable (PE) or Executable and Linkable Format (ELF), this variable entrypoint holds the raw offset of the executable’s entry point in case we are scanning a file. If we are scanning a running process, the entrypoint will hold the virtual address of the main executable’s entry point.

A typical use of this variable is to look for some pattern at the entry point to detect packers or simple file infectors.

rule EntryPointExample1
{
    strings:
        $a = { E8 00 00 00 00 }

    condition:
       $a at entrypoint
}

rule EntryPointExample2
{
    strings:
        $a = { 9C 50 66 A1 ?? ?? ?? 00 66 A9 ?? ?? 58 0F 85 }

    condition:
       $a in (entrypoint..entrypoint + 10)
}

If the file is not a PE or ELF, any rule using this variable evaluates to false.

Since YARA 3.0 the entrypoint variable is deprecated; use pe.entry_point from the PE module instead.

2.6 Accessing data at a given position

The following functions read an integer from the file or process at a given offset or virtual address:

int8()
int16()
int32()

uint8()
uint16()
uint32()

int8be()
int16be()
int32be()

uint8be()
uint16be()
uint32be()

The intXX functions read signed integers and the uintXX functions read unsigned ones. Both 16-bit and 32-bit integers are read as little-endian. To read a big-endian integer, use the corresponding function ending in be.

rule IsPE 
{ 
  condition: 
     // MZ signature at offset 0 and ... 
     uint16(0) == 0x5A4D and 
     // ... PE signature at offset stored in MZ header at 0x3C 
     uint32(uint32(0x3C)) == 0x00004550 
} 
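
To see what the IsPE condition computes, here is a rough Python equivalent built on the struct module. The functions are stand-ins named after the YARA ones; returning None for an out-of-range read is an assumption of this sketch that plays the role of an undefined value.

```python
import struct

def _read(fmt, data, offset):
    """Unpack one integer at offset, or return None if it does not fit in data."""
    size = struct.calcsize(fmt)
    if offset is None or offset < 0 or offset + size > len(data):
        return None
    return struct.unpack_from(fmt, data, offset)[0]

def uint16(data, offset):      # little-endian, like YARA's uint16()
    return _read("<H", data, offset)

def uint32(data, offset):      # little-endian, like YARA's uint32()
    return _read("<I", data, offset)

def uint16be(data, offset):    # big-endian variant
    return _read(">H", data, offset)

def is_pe(data):
    # "MZ" at offset 0, and "PE\0\0" at the offset stored at 0x3C
    return uint16(data, 0) == 0x5A4D and uint32(data, uint32(data, 0x3C)) == 0x00004550

# A made-up 0x84-byte buffer that carries only the two signatures.
sample = bytearray(0x84)
sample[0:2] = b"MZ"
struct.pack_into("<I", sample, 0x3C, 0x80)
sample[0x80:0x84] = b"PE\x00\x00"

print(is_pe(bytes(sample)))    # True
print(is_pe(b"MZ"))            # False
```

Note how b"MZ" reads as 0x5A4D in little-endian order but as 0x4D5A with uint16be.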

2.7 Sets of strings

The of operator requires that at least a given number of strings from a set be present in the file.

rule OfExample1
{
    strings:
        $foo1 = "dummy1"
        $foo2 = "dummy2"
        $foo3 = "dummy3"

    condition:
        2 of ($foo1,$foo2,$foo3)
}

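At least two of the strings in the set must be present in the file.

The elements of a set can also be given with wild-cards: 2 of ($foo*), 2 of ($*). The keyword them is equivalent to ($*).

all of them       // all strings in the rule
any of them       // any string in the rule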

2.8 Applying the same condition to many strings

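The for..of operator applies the same condition to many strings.

It is essentially a loop over the strings in a set:

for expression of string_set : ( boolean_expression ) 

The condition is true when at least expression of the strings in string_set satisfy boolean_expression, where expression can be a number, all, or any. Inside the boolean expression, $ is a placeholder for the string currently being evaluated.

The following are equivalent: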

any of ($a,$b,$c) 
for any of ($a,$b,$c) : ( $ ) 
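
Read as code, both of and for..of count how many strings in the set satisfy a predicate and compare that count with the quantifier. The Python sketch below captures just that idea; for_of is a hypothetical helper, and a plain "N of set" is the special case where the predicate is "the string is present".

```python
def for_of(quantifier, string_set, predicate):
    """Model `for <quantifier> of <string_set> : ( <predicate> )`."""
    hits = sum(1 for s in string_set if predicate(s))
    if quantifier == "all":
        return hits == len(string_set)
    if quantifier == "any":
        return hits >= 1
    return hits >= quantifier          # numeric quantifier: at least N

data = b"xx dummy1 yy dummy3"
foo = [b"dummy1", b"dummy2", b"dummy3"]

def present(s):
    return s in data

print(for_of(2, foo, present))                            # 2 of ($foo*)  -> True
print(for_of("all", foo, present))                        # all of them   -> False
print(for_of("any", foo, lambda s: data.find(s) == 3))    # for any of them : ( $ at 3 ) -> True
```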

2.9 Using anonymous strings with of and for..of

Strings that are referenced only through of or for..of do not need individual names; declare each one with a bare $.

rule AnonymousStrings
{
   strings:
       $ = "dummy1"
       $ = "dummy2"

   condition:
       1 of them
}

2.10 Iterating over string occurrences

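This section covers iterating over the occurrences of a string.

As mentioned above, @a[i] gives the offset of the i-th occurrence of $a. Sometimes you need to iterate over these offsets and check a condition for each one.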

rule Occurrences
{
    strings:
        $a = "dummy1"
        $b = "dummy2"

    condition:
        for all i in (1,2,3) : ( @a[i] + 10 == @b[i] )
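        // The index set can also be written as a range: for all i in (1..3) : ( ... )
        // or cover every occurrence of $a: for all i in (1..#a) : ( ... )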
}
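
As a rough Python model of the Occurrences rule: collect the offsets of both strings, then check the relation for each index. Treating a missing occurrence as a failed comparison is an assumption of this sketch, and offsets and occurrences_rule are hypothetical helper names.

```python
def offsets(needle, data):
    """Return every offset at which needle occurs in data (overlaps included)."""
    found, start = [], data.find(needle)
    while start != -1:
        found.append(start)
        start = data.find(needle, start + 1)
    return found

def occurrences_rule(data):
    a = offsets(b"dummy1", data)
    b = offsets(b"dummy2", data)
    # for all i in (1,2,3) : ( @a[i] + 10 == @b[i] )
    return all(
        i <= len(a) and i <= len(b) and a[i - 1] + 10 == b[i - 1]
        for i in (1, 2, 3)
    )

# Each 20-byte chunk places "dummy2" exactly 10 bytes after "dummy1".
chunk = b"dummy1" + b"...." + b"dummy2" + b"...."

print(occurrences_rule(chunk * 3))   # True
print(occurrences_rule(chunk * 2))   # False: there is no third occurrence
```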

2.11 Referencing other rules

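A condition can refer to a previously defined rule by name, much like a function call. The referenced rule must be defined before the rule that uses it.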

rule Rule1
{
    strings:
        $a = "dummy1"

    condition:
        $a
}

rule Rule2
{
    strings:
        $a = "dummy2"

    condition:
        $a and Rule1
}

3. More about rules

global rules

Global rules impose a restriction on all rules at once. For example, to make every rule ignore files that exceed a certain size limit:

global rule SizeLimit
{
    condition:
        filesize < 2MB
}

private rules

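Private rules are not reported by YARA when they match. They can serve as building blocks for other rules, and at the same time prevent cluttering YARA’s output with irrelevant information. Declare one by adding the private keyword before the rule (the rule body is omitted here):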

private rule PrivateRuleExample{}

tags

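Tags are listed after the rule identifier, separated from it by a colon, and can be used later to filter YARA's output (the rule body is omitted here):

rule TagsExample1 : Foo Bar Baz{}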

metadata

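A rule can also have a meta section made of identifier/value pairs. Metadata cannot be used in the condition; its only purpose is to store additional information about the rule.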

rule MetadataExample
{
    meta:
        my_identifier_1 = "Some string data"
        my_identifier_2 = 24
        my_identifier_3 = true

    strings:
        $my_text_string = "text here"
        $my_hex_string = { E2 34 A1 C8 23 FB }

    condition:
        $my_text_string or $my_hex_string
}
